BRING THE NOISE! MAKING SENSE OF A HAILSTORM OF METRICS. Abe Jon

1 BRING THE NOISE! MAKING SENSE OF A HAILSTORM OF METRICS Abe Jon

2 Ninety minutes is a long time. This talk: motivations (~10 min), Skyline (~25 min), Oculus (~30 min), demo! (~10 min), questions (~15 min)

3 Ninety minutes is a long time. But we have some sweet stuff to show you. This talk: motivations (~10 min), Skyline (~25 min), Oculus (~30 min), demo! (~10 min), questions (~15 min)

4 Background and Motivations

5

6 1.5 billion page views $117 million of goods sold 950 thousand users

7 1.5 billion page views $117 million of goods sold 950 thousand users (in December '12)

8 We practice continuous deployment.

9 de·ploy /diˈploi/ (verb): To release your code for the world to see, hopefully without breaking the Internet

10 250+ committers. Everyone deploys.

11 Day one: DEPLOY

12

13 30+ DEPLOYS A DAY (~8 commits per deploy!)

14 30 deploys a day? Is that safe?

15 We optimize for quick recovery by anticipating problems...

16 ...instead of fearing human error.

17 Can't fix what you don't measure! - W. Edwards Deming

18 Not homemade: Ganglia, Graphite, Nagios. Homemade: StatsD, Supergrep, Skyline, Oculus.

19 Real time error logging

20 Not all things that break throw errors. - Oscar Wilde

21 StatsD

22 StatsD::increment("foo.bar")
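The call above is a client-library one-liner. On the wire, a StatsD counter increment is a small UDP datagram of the form `name:value|c`. A minimal Python sketch, assuming a StatsD daemon listening on localhost port 8125 (the conventional default); the helper names `format_counter` and `increment` are illustrative and not from the talk:

```python
import socket


def format_counter(name, value=1):
    """Build the StatsD line for a counter, e.g. b'foo.bar:1|c'."""
    return f"{name}:{value}|c".encode("ascii")


def increment(name, host="127.0.0.1", port=8125):
    """Fire-and-forget: send a single counter increment over UDP."""
    packet = format_counter(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))
    return packet
```

Because UDP needs no handshake and no reply, instrumenting a code path this way costs almost nothing, which is what makes "graph everything" practical.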

23 If it moves, graph it!

24 If it moves, graph it! we would graph them

25 If it doesn't move, graph it anyway (it might make a run for it)

26 DASHBOARDS!

27 [ , 20] [ , 20] [ , 20] [ , 60] [ , 20] [ , 20] [ , 20] [ , 20] [ , 20] [ , 20] [ , 20]

28 DASHBOARDS! x 250,000

29

30 lol nagios

31 ...but there are also unknown unknowns - there are things we do not know we don't know.

32 Unknown anomalies

33 Unknown correlations

34 Kale.

35 Kale: - leaves - green stuff

36 Kale: - leaves SKYLINE - green stuff OCULUS

37 Q). How do you analyze a timeseries for anomalies in real time?

38 A). Lots of HTTP requests to Graphite's API!

39 Q). How do you analyze a quarter million timeseries for anomalies in real time?

40 SKYLINE

41 SKYLINE

42 A real time anomaly detection system

43 Real time?

44 Kinda.

45 StatsD Ten second resolution

46 Ganglia One minute resolution

47 Best case: ~10s (StatsD), ~1min (Ganglia)

48 Takes about 90 seconds with our throughput.

49 Still faster than you would have discovered it otherwise.

50 Memory > Disk

51

52 Q). How do you get a quarter million timeseries into Redis on time?

53 STREAM IT!

54 Graphite's relay agent original graphite backup graphite

55 Graphite's relay agent [statsd.numstats, [ , 73421]] [statsd.numstats, [ , 82345]] [statsd.numstats, [ , 80611]] pickles original graphite backup graphite

56 Graphite's relay agent [statsd.numstats, [ , 73421]] [statsd.numstats, [ , 82345]] [statsd.numstats, [ , 80611]] pickles original graphite skyline
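The pickles on this slide are Graphite's pickle protocol: each message is a 4-byte big-endian length header followed by a pickled list of `(metric, (timestamp, value))` tuples. A minimal sketch of framing and unframing such a stream; the function names are illustrative, and the timestamps in the test data are placeholders because the transcript does not preserve the originals:

```python
import pickle
import struct


def encode_batch(datapoints):
    """Frame a list of (metric, (timestamp, value)) tuples for the wire."""
    payload = pickle.dumps(datapoints, protocol=2)
    return struct.pack("!L", len(payload)) + payload


def decode_stream(buf):
    """Return every (metric, timestamp, value) found in a buffer of framed batches."""
    points = []
    offset = 0
    while offset + 4 <= len(buf):
        (size,) = struct.unpack_from("!L", buf, offset)
        start = offset + 4
        if start + size > len(buf):
            break  # incomplete frame: wait for more bytes
        # Only unpickle data from a trusted relay; pickle can execute code.
        for metric, (timestamp, value) in pickle.loads(buf[start:start + size]):
            points.append((metric, timestamp, value))
        offset = start + size
    return points
```

A listener built this way can sit where the backup Graphite used to be and receive every datapoint as it is relayed, with no polling.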

57 We import from Ganglia too.

58 Storing timeseries

59 Minimize I/O Minimize memory

60 redis.append() - Strings - Constant time - One operation per update

61 JSON?

62 => get statsd.numstats [ , 51],

63 => get statsd.numstats [ , 51], [ , 23],

64 => get statsd.numstats [ , 51], [ , 23], [ , 45],

65 OVER HALF of CPU time spent decoding JSON

66 [1,2]

67 Stuff we care about [ 1, 2 ] Extra junk

68 MESSAGEPACK

69 MESSAGEPACK A binary-based serialization protocol

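70 Things we care about \x92\x01\x02 Array size (in the header byte for small arrays; 16 or 32 bit big endian integer for larger ones)

71 Things we care about \x92\x01\x02 \x92\x02\x03 Array size (in the header byte for small arrays; 16 or 32 bit big endian integer for larger ones)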
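Why append-only storage and MessagePack fit together can be sketched in a few lines. This is a toy encoder for a tiny subset of MessagePack (two-element arrays of integers 0..127), with a `bytearray` standing in for the Redis string that `redis.append()` would grow; real code would use a full MessagePack library. The names `pack_pair` and `unpack_stream` are hypothetical:

```python
FIXARRAY_2 = 0x92  # MessagePack fixarray header: 0x90 | element count (2)


def pack_pair(timestamp, value):
    """Encode [timestamp, value]; toy subset limited to positive fixints (0..127)."""
    for n in (timestamp, value):
        if not 0 <= n <= 0x7F:
            raise ValueError("toy encoder only handles integers 0..127")
    return bytes([FIXARRAY_2, timestamp, value])


def unpack_stream(blob):
    """Decode a concatenation of packed pairs back into a list of [timestamp, value]."""
    pairs = []
    for i in range(0, len(blob), 3):
        header, timestamp, value = blob[i:i + 3]
        if header != FIXARRAY_2:
            raise ValueError("unexpected header byte")
        pairs.append([timestamp, value])
    return pairs


series = bytearray()       # stand-in for the Redis string value
series += pack_pair(1, 2)  # one append per update, no read-modify-write
series += pack_pair(2, 3)
```

Each packed datapoint is self-delimiting, so appending never requires parsing or rewriting what is already stored, and there are no brackets, commas, or whitespace to scan past on the way back out.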

72 CUT IN HALF Run Time + Memory Used

73 ROOMBA.PY CLEANS THE DATA

74 Wait...you wrote this in Python?

75 Great statistics libraries Not fun for parallelism

76 The Analyzer Assign Redis keys to each process Process decodes and analyzes
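One way to read "assign Redis keys to each process" is to slice the full key list across separate worker processes, which sidesteps Python's limits on in-process parallelism. A sketch under that assumption; `assign_keys` and `analyze_slice` are hypothetical names, and the worker body is a placeholder for fetching, decoding, and analyzing each series:

```python
from multiprocessing import Process, Queue


def assign_keys(keys, process_count):
    """Split keys into process_count round-robin slices of near-equal size."""
    return [keys[i::process_count] for i in range(process_count)]


def analyze_slice(keys, results):
    """Placeholder worker: a real one would fetch, decode, and analyze each key."""
    results.put(len(keys))


if __name__ == "__main__":
    keys = [f"metric.{i}" for i in range(10)]
    results = Queue()
    workers = [
        Process(target=analyze_slice, args=(chunk, results))
        for chunk in assign_keys(keys, 3)
    ]
    for worker in workers:
        worker.start()
    handled = [results.get() for _ in workers]  # drain before joining
    for worker in workers:
        worker.join()
    print(sorted(handled))
```

Round-robin slicing keeps the slices within one key of each other in size, so no single process becomes the straggler when the key list is long.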

77 The Analyzer Anomalous metrics written as JSON setInterval() retrieves from front end

78

79 What does it mean to be anomalous?

80 Consensus model

81 Implement everything you can get your hands on
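A consensus model can be sketched as a vote: run every detector on the series and flag it only when enough of them agree. The three detectors below are deliberately crude stand-ins invented for this sketch, not the algorithms from the talk, and the threshold of 2 is an arbitrary assumption:

```python
from statistics import mean


def above_historical_max(series):
    """Latest point exceeds everything seen before it."""
    return series[-1] > max(series[:-1])


def above_twice_mean(series):
    """Latest point is more than double the mean of the earlier points."""
    return series[-1] > 2 * mean(series[:-1])


def jump_from_previous(series):
    """Latest point is more than 1.5x the point just before it."""
    return series[-1] > 1.5 * series[-2]


ALGORITHMS = [above_historical_max, above_twice_mean, jump_from_previous]


def is_anomalous(series, algorithms=ALGORITHMS, consensus=2):
    """Flag the series only if at least `consensus` detectors vote yes."""
    votes = sum(1 for algorithm in algorithms if algorithm(series))
    return votes >= consensus
```

Voting means that adding a new detector never requires retuning the others, and a single overly sensitive detector cannot raise an alert on its own.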

82 Basic algorithm: A metric is anomalous if its latest datapoint is over three standard deviations above its moving average.
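The basic rule translates almost directly into code. A minimal sketch, assuming the moving average and standard deviation are taken over a trailing window that excludes the latest point; the window size and the name `three_sigma_anomaly` are assumptions:

```python
from statistics import mean, pstdev


def three_sigma_anomaly(series, window=30):
    """True if the latest point is > 3 standard deviations above the moving average."""
    if len(series) < 3:
        return False  # not enough history to estimate spread
    history = series[-(window + 1):-1]
    latest = series[-1]
    return latest > mean(history) + 3 * pstdev(history)


steady = [10, 11, 9, 10, 12, 10, 11]
print(three_sigma_anomaly(steady + [20]))  # True: 20 is far above the band

# An earlier spike inside the window inflates both mean and deviation,
# so the same value of 20 no longer clears the threshold.
spiky = [10, 11, 9, 10, 60, 10, 11]
print(three_sigma_anomaly(spiky + [20]))   # False
```

The second call illustrates why this rule alone is fragile: one past spike widens the band enough to hide later anomalies.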

83 Grubbs' test, ordinary least squares

84 Histogram binning

85 Four horsemen of the modelpocalypse

86 1. Seasonality 2. Spike influence 3. Normality 4. Parameters

87 Anomaly?

88 Nope.

89 Spikes artificially raise the moving average. Anomaly detected (yay!) -> bigger moving average -> anomaly missed :(

90 Real world data doesn't necessarily follow a perfect normal distribution.

91 Too many metrics to fit parameters for them all!

92 A robust set of algorithms is the current focus of this project.

93 Q). How do you analyze a quarter million timeseries for correlations?

94 OCULUS

95 Image comparison is expensive and slow

96 Use raw timeseries instead of raw graphs [[975, ], [643, ], [750, ], [992, ], [580, ], [586, ], [649, ], [548, ], [901, ], [633, ]]

97 HARD PROBLEMS Naming Things Cache Invalidation Numerical Comparison?

98 HARD PROBLEMS Naming Things Cache Invalidation Numerical Comparison?

99 Euclidean Distance

100 Dynamic Time Warping (helps with phase shifts)
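The difference between the two measures shows up as soon as one series lags the other. Below is a sketch of both: plain Euclidean distance, and the textbook dynamic-programming form of DTW, which fills an N x M cost table and is therefore quadratic. This is the classic formulation, not necessarily the exact variant used in Oculus:

```python
import math


def euclidean_distance(a, b):
    """Point-by-point distance; both series must be the same length."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Classic DTW: cheapest monotonic alignment of a onto b, O(len(a) * len(b))."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: stretch a, stretch b, or advance both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


bump = [0, 0, 1, 2, 1, 0, 0]
shifted = [0, 1, 2, 1, 0, 0, 0]  # same bump, one step earlier
print(euclidean_distance(bump, shifted))  # 2.0
print(dtw_distance(bump, shifted))        # 0.0
```

Euclidean distance penalizes the one-step shift even though the shapes are identical; DTW warps the time axis and reports a perfect match.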

101 We've solved it!

102 O(N^2)

103 O(N^2) x 250k

104 Too slow!

105 doesn't

106 No need to run DTW on all 250k.

107 Discard obviously dissimilar metrics.

108 Shape Description Alphabet sharpdecrement flat increment sharpincrement flat flat sharpdecrement

109 Shape Description Alphabet (normalization step) sharpdecrement flat increment sharpincrement flat flat sharpdecrement
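A shape description can be sketched as: normalize the series to a 0..1 range, then replace each step between consecutive points with a word describing its slope. The `flat` and `sharp` thresholds below are invented for illustration, as is the `decrement` word (implied by the alphabet but not shown on the slide):

```python
def normalize(series):
    """Rescale to 0..1 so that series of different magnitude compare equal."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0] * len(series)
    return [(x - lo) / (hi - lo) for x in series]


def describe_shape(series, flat=0.05, sharp=0.3):
    """Turn a series into a space-separated string of slope words."""
    scaled = normalize(series)
    words = []
    for previous, current in zip(scaled, scaled[1:]):
        delta = current - previous
        if abs(delta) <= flat:
            words.append("flat")
        elif delta > sharp:
            words.append("sharpincrement")
        elif delta > 0:
            words.append("increment")
        elif delta < -sharp:
            words.append("sharpdecrement")
        else:
            words.append("decrement")
    return " ".join(words)


print(describe_shape([100, 20, 22, 40, 90, 91, 90, 10]))
```

Because the output is plain text, two metrics with similar shapes produce similar word sequences regardless of scale, which is what lets a text search engine do the cheap first-pass filtering.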

110

111 Search for shape description fingerprint in Elasticsearch

112 Run DTW on results as final polish

113 O(N^2) on ~10k metrics

114 Still too slow.

115 Fast DTW - O(N) coarsen project refine

116 Elasticsearch Details Phrase search for first pass scores across shape description fingerprints

117 Elasticsearch Details Phrase search for first pass scores across shape description fingerprints Custom FastDTW and Euclidean distance plugins to score across the remaining filtered timeseries

118 Elasticsearch Structure { :id => "statsd.numstats", :fingerprint => "sdec inc sinc sdec", :values => " " }

119 Mappings Specify tokenizers Untouched fields

120 First pass query (on the shape description fingerprint) :match => { :fingerprint => { :query => "sdec inc sinc sdec inc", :type => "phrase", } } :slop => 20

121 Refinement query (scored against the raw timeseries) {:custom_score => { :query => <first_pass_query>, :script => "oculus_dtw", :params => { :query_value => , :query_field => "values.untouched", }, } }

122 KALE StatsD Graphite Ganglia Elasticsearch Skyline Flask Sinatra Resque

123 Populating Elasticsearch

124 resque workers ES Index

125 Too slow to update and search

126 Webapp New Index Last Index

127 Sinatra frontend Queries ES Renders results

128 Collections

129 devops <3

130

131 Special thanks to: Dr. Neil Gunther, PerfDynamics Dr. Brian Whitman, Echonest Burc Arpat, Facebook Seth Walker, Etsy Rafe Colburn, Etsy Mike Rembetsy, Etsy John Allspaw, Etsy

132 Thanks!

Prometheus. A Next Generation Monitoring System. Brian Brazil Founder

Prometheus. A Next Generation Monitoring System. Brian Brazil Founder Prometheus A Next Generation Monitoring System Brian Brazil Founder Who am I? Engineer passionate about running software reliably in production. Based in Ireland Core-Prometheus developer Contributor to

More information

Time Series Live 2017

Time Series Live 2017 1 Time Series Schemas @Percona Live 2017 Who Am I? Chris Larsen Maintainer and author for OpenTSDB since 2013 Software Engineer @ Yahoo Central Monitoring Team Who I m not: A marketer A sales person 2

More information

Fully Optimize FULLY OPTIMIZE YOUR DBA RESOURCES

Fully Optimize FULLY OPTIMIZE YOUR DBA RESOURCES Fully Optimize FULLY OPTIMIZE YOUR DBA RESOURCES IMPROVE SERVER PERFORMANCE, UPTIME, AND AVAILABILITY WHILE LOWERING COSTS WE LL COVER THESE TOP WAYS TO OPTIMIZE YOUR RESOURCES: 1 Be Smart About Your Wait

More information

Monitoring MySQL with Prometheus & Grafana

Monitoring MySQL with Prometheus & Grafana Monitoring MySQL with Prometheus & Grafana Julien Pivotto (@roidelapluie) Percona University Belgium June 22nd, 2017 SELECT USER(); Julien "roidelapluie" Pivotto @roidelapluie Sysadmin at inuits Automation,

More information

SQLite vs. MongoDB for Big Data

SQLite vs. MongoDB for Big Data SQLite vs. MongoDB for Big Data In my latest tutorial I walked readers through a Python script designed to download tweets by a set of Twitter users and insert them into an SQLite database. In this post

More information

FAST, FLEXIBLE, RELIABLE SEAMLESSLY ROUTING AND SECURING BILLIONS OF REQUESTS PER MONTH

FAST, FLEXIBLE, RELIABLE SEAMLESSLY ROUTING AND SECURING BILLIONS OF REQUESTS PER MONTH We help Big Brands, Scale WordPress. WORDPRESS HOSTING MANAGED BY PROFESSIONALS PAGELY, INC pagely.com THE PAGELY ARES APPLICATION GATEWAY FAST, FLEXIBLE, RELIABLE SEAMLESSLY ROUTING AND SECURING BILLIONS

More information

Evolution of the "Web

Evolution of the Web Evolution of the "Web App" @HenrikJoreteg @Hoarse_JS THIS USED TO BE SIMPLE! 1. WRITE SOME HTML 2. LAY IT OUT WITH FRAMES OR TABLES 3. FTP IT TO A SERVER! 4. BAM! CONGRATULATIONS, YOU RE A WEB DEVELOPER!

More information

Trending with Purpose. Jason Dixon

Trending with Purpose. Jason Dixon Trending with Purpose Jason Dixon Monitoring Nagios Fault Detection Notifications Escalations Acknowledgements/Downtime http://www.nagios.org/ Nagios Pros Free Extensible Plugins Configuration templates

More information

Storing metrics at scale with. Gnocchi. Julien Danjou OpenStack Day France 22 November 2016

Storing metrics at scale with. Gnocchi. Julien Danjou OpenStack Day France 22 November 2016 Storing metrics at scale with Gnocchi Julien Danjou OpenStack Day France 22 November 2016 Hello! I am Julien Danjou Principal Software Engineer at Red Hat You can find me at @juldanjou 1 What s the problem?

More information

Application monitoring with BELK. Nishant Sahay, Sr. Architect Bhavani Ananth, Architect

Application monitoring with BELK. Nishant Sahay, Sr. Architect Bhavani Ananth, Architect Application monitoring with BELK Nishant Sahay, Sr. Architect Bhavani Ananth, Architect Why logs Business PoV Input Data Analytics User Interactions /Behavior End user Experience/ Improvements 2017 Wipro

More information

Evolving Prometheus for the Cloud Native World. Brian Brazil Founder

Evolving Prometheus for the Cloud Native World. Brian Brazil Founder Evolving Prometheus for the Cloud Native World Brian Brazil Founder Who am I? Engineer passionate about running software reliably in production. Core developer of Prometheus Studied Computer Science in

More information

Monitor your containers with the Elastic Stack. Monica Sarbu

Monitor your containers with the Elastic Stack. Monica Sarbu Monitor your containers with the Elastic Stack Monica Sarbu Monica Sarbu Team lead, Beats team monica@elastic.co 3 Monitor your containers with the Elastic Stack Elastic Stack 5 Beats are lightweight shippers

More information

Elasticsearch & ATLAS Data Management. European Organization for Nuclear Research (CERN)

Elasticsearch & ATLAS Data Management. European Organization for Nuclear Research (CERN) Elasticsearch & ATAS Data Management European Organization for Nuclear Research (CERN) ralph.vigne@cern.ch mario.lassnig@cern.ch ATAS Analytics Platform proposed eb. 2015; work in progress; correlate data

More information

The Art of Container Monitoring. Derek Chen

The Art of Container Monitoring. Derek Chen The Art of Container Monitoring Derek Chen 2016.9.22 About me DevOps Engineer at Trend Micro Agile transformation Micro service and cloud service Docker integration Monitoring system development Automate

More information

opentsdb - Metrics for a distributed world Oliver Hankeln /

opentsdb - Metrics for a distributed world Oliver Hankeln / opentsdb - Metrics for a distributed world Oliver Hankeln / gutefrage.net @mydalon Who am I? Senior Engineer - Data and Infrastructure at gutefrage.net GmbH Was doing software development before DevOps

More information

Monitor your infrastructure with the Elastic Beats. Monica Sarbu

Monitor your infrastructure with the Elastic Beats. Monica Sarbu Monitor your infrastructure with the Elastic Beats Monica Sarbu Monica Sarbu Team lead, Beats team Email: monica@elastic.co Twitter: 2 Monitor your servers Apache logs 3 Monitor your servers Apache logs

More information

Data Analyst Nanodegree Syllabus

Data Analyst Nanodegree Syllabus Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working

More information

Using Redis As a Time Series Database

Using Redis As a Time Series Database WHITE PAPER Using Redis As a Time Series Database Dr.Josiah Carlson, Author of Redis in Action CONTENTS Executive Summary 2 Use Cases 2 Advanced Analysis Using a Sorted Set with Hashes 2 Event Analysis

More information

Monitoring system for geographically distributed datacenters based on Openstack. Gioacchino Vino

Monitoring system for geographically distributed datacenters based on Openstack. Gioacchino Vino Monitoring system for geographically distributed datacenters based on Openstack Gioacchino Vino Tutor: Dott. Domenico Elia Tutor: Dott. Giacinto Donvito Borsa di studio GARR Orio Carlini 2016-2017 INFN

More information

Identifying Workloads for the Cloud

Identifying Workloads for the Cloud Identifying Workloads for the Cloud 1 This brief is based on a webinar in RightScale s I m in the Cloud Now What? series. Browse our entire library for webinars on cloud computing management. Meet our

More information

whitepaper Using Redis As a Time Series Database: Why and How

whitepaper Using Redis As a Time Series Database: Why and How whitepaper Using Redis As a Time Series Database: Why and How Author: Dr.Josiah Carlson, Author of Redis in Action Table of Contents Executive Summary 2 A Note on Race Conditions and Transactions 2 Use

More information

Feature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule.

Feature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule. CS 188: Artificial Intelligence Fall 2007 Lecture 26: Kernels 11/29/2007 Dan Klein UC Berkeley Feature Extractors A feature extractor maps inputs to feature vectors Dear Sir. First, I must solicit your

More information

Monitoring MySQL Performance with Percona Monitoring and Management

Monitoring MySQL Performance with Percona Monitoring and Management Monitoring MySQL Performance with Percona Monitoring and Management Santa Clara, California April 23th 25th, 2018 MIchael Coburn, Product Manager Your Presenter Product Manager for PMM (also Percona Toolkit

More information

Visualize Your Data With Grafana Percona Live Daniel Lee - Software Engineer at Grafana Labs

Visualize Your Data With Grafana Percona Live Daniel Lee - Software Engineer at Grafana Labs Visualize Your Data With Grafana Percona Live 2017 Daniel Lee - Software Engineer at Grafana Labs Daniel Lee Software Engineer at Grafana Labs Stockholm, Sweden @danlimerick on Twitter What is Grafana?

More information

ReJSON = { "activity": "new trick" } Itamar

ReJSON = { activity: new trick } Itamar ReJSON = { "id": "old dog", "activity": "new trick" } Itamar Haber @itamarhaber What do Chuck Norris, JSON & Redis have in common? They're everywhere. "Any application that can be written in JavaScript,

More information

Scalable Time Series in PCP. Lukas Berk

Scalable Time Series in PCP. Lukas Berk Scalable Time Series in PCP Lukas Berk Summary Problem Statement Proposed Solution Redis Basic Types Summary Current Work Future Work Items Problem Statement Scaling PCP s metrics querying to hundreds/thousands

More information

JAVASCRIPT CHARTING. Scaling for the Enterprise with Metric Insights Copyright Metric insights, Inc.

JAVASCRIPT CHARTING. Scaling for the Enterprise with Metric Insights Copyright Metric insights, Inc. JAVASCRIPT CHARTING Scaling for the Enterprise with Metric Insights 2013 Copyright Metric insights, Inc. A REVOLUTION IS HAPPENING... 3! Challenges... 3! Borrowing From The Enterprise BI Stack... 4! Visualization

More information

Unlimited Scalability in the Cloud A Case Study of Migration to Amazon DynamoDB

Unlimited Scalability in the Cloud A Case Study of Migration to Amazon DynamoDB Unlimited Scalability in the Cloud A Case Study of Migration to Amazon DynamoDB Steve Saporta CTO, SpinCar Mar 19, 2016 SpinCar When a web-based business grows... More customers = more transactions More

More information

@InfluxDB. David Norton 1 / 69

@InfluxDB. David Norton  1 / 69 @InfluxDB David Norton (@dgnorton) david@influxdb.com 1 / 69 Instrumenting a Data Center 2 / 69 3 / 69 4 / 69 The problem: Efficiently monitor hundreds or thousands of servers 5 / 69 The solution: Automate

More information

Manage MySQL like a devops sysadmin. Frédéric Descamps

Manage MySQL like a devops sysadmin. Frédéric Descamps Manage MySQL like a devops sysadmin Frédéric Descamps Webinar Oct 2012 Who am I? Frédéric Descamps @lefred http://about.be/lefred Managing MySQL since 3.23 (as far as I remember) devops believer www.percona.com

More information

CLIENT SERVER ARCHITECTURE:

CLIENT SERVER ARCHITECTURE: CLIENT SERVER ARCHITECTURE: Client-Server architecture is an architectural deployment style that describe the separation of functionality into layers with each segment being a tier that can be located

More information

RavenDB & document stores

RavenDB & document stores université libre de bruxelles INFO-H415 - Advanced Databases RavenDB & document stores Authors: Yasin Arslan Jacky Trinh Professor: Esteban Zimányi Contents 1 Introduction 3 1.1 Présentation...................................

More information

Index Construction. Dictionary, postings, scalable indexing, dynamic indexing. Web Search

Index Construction. Dictionary, postings, scalable indexing, dynamic indexing. Web Search Index Construction Dictionary, postings, scalable indexing, dynamic indexing Web Search 1 Overview Indexes Query Indexing Ranking Results Application Documents User Information analysis Query processing

More information

Getting Started User s Guide

Getting Started User s Guide Getting Started User s Guide Savision iq V2.3 Contents 1. Introduction... 4 1.1 About this Guide... 4 1.2 Understanding Savision iq... 4 2. First Run Experience... 4 2.1 Adding the License Key... 5 2.2

More information

iems Interactive Experiment Management System Final Report

iems Interactive Experiment Management System Final Report iems Interactive Experiment Management System Final Report Pēteris Ņikiforovs Introduction Interactive Experiment Management System (Interactive EMS or iems) is an experiment management system with a graphical

More information

Wrangling Logs with Logstash and ElasticSearch

Wrangling Logs with Logstash and ElasticSearch Wrangling Logs with Logstash and ElasticSearch Nate Jones & David Castro Media Temple OSCON 2012 Why are we here? Size Quantity Efficiency Access Locality Method Filtering Grokability Noise Structure Metrics

More information

How to store millions metrics per second. Vladimir Smirnov System Administrator. SREcon17 Asia/Australia 22 May 2017

How to store millions metrics per second. Vladimir Smirnov System Administrator. SREcon17 Asia/Australia 22 May 2017 Graphite@Scale: How to store millions metrics per second Vladimir Smirnov System Administrator SREcon17 Asia/Australia 22 May 2017 Why you might need to store your metrics? Most common cases: Capacity

More information

Monitoring Java in Docker at CDK

Monitoring Java in Docker at CDK CASE STUDY Monitoring Java in Docker at CDK The Digital Marketing business unit of CDK global shifted to a containerized approach for their next generation infrastructure. One of the challenges they ran

More information

From 1 to 10K with Ganglia and Nagios. Spike Morelli aka Space Linden

From 1 to 10K with Ganglia and Nagios. Spike Morelli aka Space Linden From 1 to 10K with Ganglia and Nagios Spike Morelli aka Space Linden About Second Life 3D Virtual World Not a game About Second Life Built by Residents Textured Scripted Animated Owned About Second Life

More information

Panoptes: A Network Telemetry Ecosystem - Part Deux

Panoptes: A Network Telemetry Ecosystem - Part Deux Panoptes: A Network Telemetry Ecosystem - Part Deux Panoptes is: Greenfield Python based network telemetry platform that provides real time telemetry and analytics @ Yahoo Implements discovery, polling,

More information

What is a graph database?

What is a graph database? What is a graph database? A graph database is a data store that has been optimized for highly connected data. Storing connected data in a flat tabular format is time and resource intensive, usually requiring

More information

Efficient and Scalable Friend Recommendations

Efficient and Scalable Friend Recommendations Efficient and Scalable Friend Recommendations Comparing Traditional and Graph-Processing Approaches Nicholas Tietz Software Engineer at GraphSQL nicholas@graphsql.com January 13, 2014 1 Introduction 2

More information

Putting together the platform: Riak, Redis, Solr and Spark. Bryan Hunt

Putting together the platform: Riak, Redis, Solr and Spark. Bryan Hunt Putting together the platform: Riak, Redis, Solr and Spark Bryan Hunt 1 $ whoami Bryan Hunt Client Services Engineer @binarytemple 2 Minimum viable product - the ideologically correct doctrine 1. Start

More information

Firefox Crash Reporting.

Firefox Crash Reporting. Firefox Crash Reporting laura@ mozilla.com @lxt Webtools @ Mozilla Crash reporting Localization Performance measurement Code search and static analysis Other stuff: product delivery and updates, plugins

More information

Data Analyst Nanodegree Syllabus

Data Analyst Nanodegree Syllabus Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working

More information

Maintaining Spatial Data Infrastructures (SDIs) using distributed task queues

Maintaining Spatial Data Infrastructures (SDIs) using distributed task queues 2017 FOSS4G Boston Maintaining Spatial Data Infrastructures (SDIs) using distributed task queues Paolo Corti and Ben Lewis Harvard Center for Geographic Analysis Background Harvard Center for Geographic

More information

Dynatrace FastPack for Liferay DXP

Dynatrace FastPack for Liferay DXP Dynatrace FastPack for Liferay DXP The Dynatrace FastPack for Liferay Digital Experience Platform provides a preconfigured Dynatrace profile custom tailored to Liferay DXP environments. This FastPack contains

More information

How APEXBlogs was built

How APEXBlogs was built How APEXBlogs was built By Dimitri Gielis, APEX Evangelists Copyright 2011 Apex Evangelists apex-evangelists.com How APEXBlogs was built By Dimitri Gielis This article describes how and why APEXBlogs was

More information

Enabling Performance & Stress Test throughout the Application Lifecycle

Enabling Performance & Stress Test throughout the Application Lifecycle Enabling Performance & Stress Test throughout the Application Lifecycle March 2010 Poor application performance costs companies millions of dollars and their reputation every year. The simple challenge

More information

Monitoring MySQL Performance with Percona Monitoring and Management

Monitoring MySQL Performance with Percona Monitoring and Management Monitoring MySQL Performance with Percona Monitoring and Management Your Presenters Michael Coburn - PMM Product Manager Working at Percona for almost 5 years Consultant, Manager, TAM, now Product Manager

More information

Effecient monitoring with Open source tools. Osman Ungur, github.com/o

Effecient monitoring with Open source tools. Osman Ungur, github.com/o Effecient monitoring with Open source tools Osman Ungur, github.com/o Who i am? software developer with system-administration background over 10 years mostly writes Java and PHP also working about infrastructure

More information

Using PostgreSQL in Tantan - From 0 to 350bn rows in 2 years

Using PostgreSQL in Tantan - From 0 to 350bn rows in 2 years Using PostgreSQL in Tantan - From 0 to 350bn rows in 2 years Victor Blomqvist vb@viblo.se Tantan ( 探探 ) December 2, PGConf Asia 2016 in Tokyo tantanapp.com 1 Sweden - Tantan - Tokyo 10 Million 11 Million

More information

Anomaly Detection Fault Tolerance Anticipation

Anomaly Detection Fault Tolerance Anticipation Anomaly Detection Fault Tolerance Anticipation Patterns John Allspaw SVP, Tech Ops Qcon London 2012 Four Cornerstones Erik Hollnagel (Anticipation) (Response) Knowing Knowing Knowing Knowing What What

More information

Huge Codebases Application Monitoring with Hystrix

Huge Codebases Application Monitoring with Hystrix Huge Codebases Application Monitoring with Hystrix 30 Jan. 2016 Roman Mohr Red Hat FOSDEM 2016 1 About Me Roman Mohr Software Engineer at Red Hat Member of the SLA team in ovirt Mail: rmohr@redhat.com

More information

Site Speed: To Measure Is To Know. Sava Sertov QA Technical Lead ecommera

Site Speed: To Measure Is To Know. Sava Sertov QA Technical Lead ecommera Site Speed: To Measure Is To Know Sava Sertov QA Technical Lead ecommera We want to be faster than our competitors "80-90% of the end-user response time is spent on the front-end. Start there. Someone

More information

CSCE 120: Learning To Code

CSCE 120: Learning To Code CSCE 120: Learning To Code Module 11.0: Consuming Data I Introduction to Ajax This module is designed to familiarize you with web services and web APIs and how to connect to such services and consume and

More information

Building a Kubernetes on Bare-Metal Cluster to Serve Wikipedia. Alexandros Kosiaris Giuseppe Lavagetto

Building a Kubernetes on Bare-Metal Cluster to Serve Wikipedia. Alexandros Kosiaris Giuseppe Lavagetto Building a Kubernetes on Bare-Metal Cluster to Serve Wikipedia Alexandros Kosiaris Giuseppe Lavagetto Introduction The Wikimedia Foundation is the organization running the infrastructure supporting Wikipedia

More information

!! What is virtual memory and when is it useful? !! What is demand paging? !! When should pages in memory be replaced?

!! What is virtual memory and when is it useful? !! What is demand paging? !! When should pages in memory be replaced? Chapter 10: Virtual Memory Questions? CSCI [4 6] 730 Operating Systems Virtual Memory!! What is virtual memory and when is it useful?!! What is demand paging?!! When should pages in memory be replaced?!!

More information

OpenNTI Collect and visualize KPI from Networks devices

OpenNTI Collect and visualize KPI from Networks devices OpenNTI Collect and visualize KPI from Networks devices Open Network Telemetry Insights Efrain Gonzalez (efrain@juniper.net) Pablo Sagrera (psagrera@juniper.net) Version 3.0 / Oct 2017 OpenNTI / Dashboard

More information

Media-Ready Network Transcript

Media-Ready Network Transcript Media-Ready Network Transcript Hello and welcome to this Cisco on Cisco Seminar. I m Bob Scarbrough, Cisco IT manager on the Cisco on Cisco team. With me today are Sheila Jordan, Vice President of the

More information

Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi.

Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi. Data Structures and Algorithms Dr. Naveen Garg Department of Computer Science and Engineering Indian Institute of Technology, Delhi Lecture 18 Tries Today we are going to be talking about another data

More information

A Guide to Finding the Best WordPress Backup Plugin: 10 Must-Have Features

A Guide to Finding the Best WordPress Backup Plugin: 10 Must-Have Features A Guide to Finding the Best WordPress Backup Plugin: 10 Must-Have Features \ H ow do you know if you re choosing the best WordPress backup plugin when it seems that all the plugins seem to do the same

More information

CO Computer Architecture and Programming Languages CAPL. Lecture 15

CO Computer Architecture and Programming Languages CAPL. Lecture 15 CO20-320241 Computer Architecture and Programming Languages CAPL Lecture 15 Dr. Kinga Lipskoch Fall 2017 How to Compute a Binary Float Decimal fraction: 8.703125 Integral part: 8 1000 Fraction part: 0.703125

More information

DISQUS. Continuous Deployment Everything. David

DISQUS. Continuous Deployment Everything. David DISQUS Continuous Deployment Everything David Cramer @zeeg Continuous Deployment Shipping new code as soon as it s ready (It s really just super awesome buildbots) Workflow Commit (master) Integration

More information

How to set up SQL Source Control The short guide for evaluators

How to set up SQL Source Control The short guide for evaluators GUIDE How to set up SQL Source Control The short guide for evaluators 1 Contents Introduction Team Foundation Server & Subversion setup Git setup Setup without a source control system Making your first

More information

Regain control thanks to Prometheus. Guillaume Lefevre, DevOps Engineer, OCTO Technology Etienne Coutaud, DevOps Engineer, OCTO Technology

Regain control thanks to Prometheus. Guillaume Lefevre, DevOps Engineer, OCTO Technology Etienne Coutaud, DevOps Engineer, OCTO Technology Regain control thanks to Prometheus Guillaume Lefevre, DevOps Engineer, OCTO Technology Etienne Coutaud, DevOps Engineer, OCTO Technology About us Guillaume Lefevre DevOps Engineer, OCTO Technology @guillaumelfv

More information

Open Source Database Performance Optimization and Monitoring with PMM. Fernando Laudares, Vinicius Grippa, Michael Coburn Percona
