Time Series Schemas @ Percona Live 2017

Time Series Schemas @ Percona Live 2017

Who Am I?
Chris Larsen
- Maintainer and author for OpenTSDB since 2013
- Software Engineer @ Yahoo, Central Monitoring Team
Who I'm not:
- A marketer
- A salesperson

What are Time Series?
Time Series: A sequence of discrete data points (values), ordered and indexed by time, and associated with an identity. E.g.:
  web01.sys.cpu.busy.pct   45%      1/1/2017 12:01:00
  web01.sys.cpu.busy.pct   52%      1/1/2017 12:02:00
  web01.sys.cpu.busy.pct   35%      1/1/2017 12:03:00
  ^ Identity               ^ Value  ^ Timestamp

What are Time Series?
Data Point: Metric: sys.cpu.user + Tags: host=web01 cpu=0 + Value: 42 + Timestamp: 1234567890
  sys.cpu.user 1234567890 42 host=web01 cpu=0
  ^ a data point ^
Payload could also be a string, a blob, a histogram, etc.
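As a minimal sketch of that data point layout, the line above can be parsed into a small record. The `DataPoint` class and `parse_line` helper are hypothetical names used only for illustration; they are not part of any OpenTSDB API.

```python
from dataclasses import dataclass


@dataclass
class DataPoint:
    metric: str      # e.g. "sys.cpu.user"
    timestamp: int   # Unix epoch seconds
    value: float     # numeric payload (could be a string or blob in other systems)
    tags: dict       # tag key -> tag value, e.g. {"host": "web01"}


def parse_line(line: str) -> DataPoint:
    """Parse 'metric timestamp value tagk=tagv ...' into a DataPoint."""
    metric, timestamp, value, *tag_parts = line.split()
    tags = dict(part.split("=", 1) for part in tag_parts)
    return DataPoint(metric, int(timestamp), float(value), tags)


point = parse_line("sys.cpu.user 1234567890 42 host=web01 cpu=0")
print(point)
```

The metric plus the full tag set forms the identity of the series; the timestamp and value are the per-point payload.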

Choose Your Own Adventure!
- You're developing a new app and want to see how long it takes to call that backend service.
- A web server is super slow and you want to track connections and latencies without parsing logs.
- You're running a lab experiment and want to count cell divisions per second.

In the Beginning: Flat Files
Slap in some code to append to a file. Import CSVs to Excel and graph it!
PLUS:
- Easy to share
- Easy to parse with code
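The flat-file approach might look like the sketch below: one CSV file per series, appended to on every measurement. The file name, column order, and helper names are assumptions made for the example.

```python
import csv
import os
import tempfile


def append_point(path, identity, value, timestamp):
    """Append one measurement as a CSV row; the file is created on first use."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([identity, value, timestamp])


def read_points(path):
    """Read the file back as (identity, value, timestamp) tuples."""
    with open(path, newline="") as f:
        return [(name, float(value), ts) for name, value, ts in csv.reader(f)]


# A temporary directory keeps the example self-contained.
path = os.path.join(tempfile.mkdtemp(), "web01.sys.cpu.busy.pct.csv")
append_point(path, "web01.sys.cpu.busy.pct", 45, "2017-01-01 12:01:00")
append_point(path, "web01.sys.cpu.busy.pct", 52, "2017-01-01 12:02:00")
append_point(path, "web01.sys.cpu.busy.pct", 35, "2017-01-01 12:03:00")

points = read_points(path)
peak = max(value for _, value, _ in points)
print(peak)
```

This is easy to share and parse, but every new series means another file and another hand-written reader.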

Choose Your Own Adventure!
Co-workers: I like your instrumentation and graphs! We have more (apps, servers, experiments) for you to instrument. Can you do it? And give us a UI and CLI and (etc., etc., etc.)
You: ... Sure!

In the Beginning: Flat Files
Now you see some problems:
- Many series == many files.
- How do you query lots of files?
- What if you grow to the point you're thrashing the disk IO?
- Roll your own query and join code between files.
- Roll your own graphing server, CLI, etc.

RDBMS to the rescue!
Pros:
- Industry standard APIs and tools.
- Standard query language with transforms, filtering, etc.
- Replication, backups, high availability.
- Lots of vendors (OSS and paid) to choose from.
- Just have to create a UI.

First Schema:
Index on metric and timestamp. Easy to query for time ranges and specific metrics.
  SELECT max(value)
  FROM timeseriestable
  WHERE metric = 'web01.sys.cpu.busy.pct'
    AND timestamp BETWEEN '2011-05-07' AND '2011-05-07 23:59:59.999'
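A runnable sketch of this first schema, using Python's built-in sqlite3 module as a stand-in for whatever RDBMS is actually in use. The column types, the index name `idx_metric_ts`, and the sample rows are assumptions; timestamps are stored as ISO-style text so that string comparison matches time order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE timeseriestable ("
    " metric TEXT NOT NULL, timestamp TEXT NOT NULL, value REAL NOT NULL)"
)
# Composite index: equality on metric first, then a range on timestamp.
conn.execute("CREATE INDEX idx_metric_ts ON timeseriestable (metric, timestamp)")

conn.executemany(
    "INSERT INTO timeseriestable VALUES (?, ?, ?)",
    [
        ("web01.sys.cpu.busy.pct", "2011-05-07 12:01:00", 45.0),
        ("web01.sys.cpu.busy.pct", "2011-05-07 12:02:00", 52.0),
        ("web01.sys.cpu.busy.pct", "2011-05-08 00:00:00", 99.0),  # outside the range
        ("web02.sys.cpu.busy.pct", "2011-05-07 12:01:00", 80.0),  # other metric
    ],
)

(max_value,) = conn.execute(
    "SELECT max(value) FROM timeseriestable"
    " WHERE metric = ? AND timestamp BETWEEN ? AND ?",
    ("web01.sys.cpu.busy.pct", "2011-05-07", "2011-05-07 23:59:59.999"),
).fetchone()
print(max_value)
```

Every series shares the one table and the one index, which is what makes the design simple at first and painful as write volume grows.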

Choose Your Own Adventure!
Co-workers: I love SQL so much! Thank you! By the way, we're going to push 1000 new metrics per second in an hour. Have a great lunch break.
You: ...

First Schema:
Cons: More metrics and/or more frequent data means:
- Bigger and bigger indices
- Slower queries as the data set grows
- Deleting data to clean up huge tables takes longer

Second Schema:
- Shard tables by month (later on by day, then hour...).
- Join across tables in the DB or in app.
- Delete old data by dropping a table.
- Room to grow.
  SELECT max(value)
  FROM timeseriestable_2011_05_07
  WHERE metric = 'web01.sys.cpu.busy.pct'
    AND timestamp BETWEEN '2011-05-07' AND '2011-05-07 23:59:59.999'
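One way the application-side routing for time-sharded tables could work is sketched below, again with sqlite3 as a stand-in database. The helpers `shard_tables` and `max_across_shards` are hypothetical; the table naming follows the `timeseriestable_YYYY_MM_DD` pattern from the slide.

```python
import sqlite3
from datetime import date, timedelta


def shard_tables(start, end, prefix="timeseriestable"):
    """Names of the daily shard tables covering start..end inclusive."""
    days = (end - start).days
    return [f"{prefix}_{(start + timedelta(days=i)):%Y_%m_%d}" for i in range(days + 1)]


def max_across_shards(conn, metric, start, end):
    """Run one sub-select per daily shard and combine them with UNION ALL."""
    tables = shard_tables(start, end)
    selects = [f"SELECT value FROM {t} WHERE metric = ?" for t in tables]
    sql = "SELECT max(value) FROM (" + " UNION ALL ".join(selects) + ")"
    return conn.execute(sql, [metric] * len(tables)).fetchone()[0]


conn = sqlite3.connect(":memory:")
metric = "web01.sys.cpu.busy.pct"
days = [date(2011, 5, 6), date(2011, 5, 7), date(2011, 5, 8)]
for day, value in zip(days, (45.0, 52.0, 35.0)):
    table = shard_tables(day, day)[0]
    conn.execute(f"CREATE TABLE {table} (metric TEXT, timestamp TEXT, value REAL)")
    conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?)", (metric, f"{day} 12:00:00", value))

peak = max_across_shards(conn, metric, days[0], days[-1])

# Expiring the oldest day is a cheap DROP TABLE instead of a large DELETE.
conn.execute("DROP TABLE timeseriestable_2011_05_06")
peak_after_drop = max_across_shards(conn, metric, days[1], days[-1])
print(peak, peak_after_drop)
```

The cost is that the routing logic now lives in the application, and queries over long ranges fan out across many tables.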

Choose Your Own Adventure!
Co-workers: Thanks for bringing the DB back up but it's down again. I think it could be because the ? group started pushing 100,000 metrics per second and are now sending metrics like host.system.cpu.core.busy.pct.
You: ... oh.

Second Schema:
Cons:
- While it helps buy some time, with continued growth you still have the problems of V1.
- One abuser can easily take down your system.

Third Schema:
- Shard tables by time and group (even by server).
- Reduce storage by using UID tables.
  SELECT max(ts.value), m.metric, h.host, dc.datacenter
  FROM groupa_2011_05_07 ts
  JOIN datacenters dc ON ts.datacenterid = dc.datacenterid
  JOIN metrics m ON ts.metricid = m.metricid
  JOIN hosts h ON ts.hostid = h.hostid
  WHERE m.metric = 'web01.sys.cpu.busy.pct'
    AND h.host REGEXP 'web.*'
    AND dc.datacenter IN ('lga', 'phx')
    AND ts.timestamp BETWEEN '2011-05-07' AND '2011-05-07 23:59:59.999'
  GROUP BY m.metric, h.host, dc.datacenter
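The UID-table idea can be tried out with the sketch below, which uses sqlite3 in place of the production database. Several details are assumptions for the example: the column types, the sample rows, a metric name without the host prefix (the host is its own dimension here), and `LIKE 'web%'` instead of `REGEXP`, because SQLite has no built-in REGEXP function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Lookup (UID) tables: each string is stored once and referenced by integer id.
CREATE TABLE metrics     (metricid     INTEGER PRIMARY KEY, metric     TEXT UNIQUE);
CREATE TABLE hosts       (hostid       INTEGER PRIMARY KEY, host       TEXT UNIQUE);
CREATE TABLE datacenters (datacenterid INTEGER PRIMARY KEY, datacenter TEXT UNIQUE);

-- One data table per group and day; rows carry only ids, time, and value.
CREATE TABLE groupa_2011_05_07 (
    metricid INTEGER, hostid INTEGER, datacenterid INTEGER,
    timestamp TEXT, value REAL
);

INSERT INTO metrics VALUES (1, 'sys.cpu.busy.pct');
INSERT INTO hosts VALUES (1, 'web01'), (2, 'web02'), (3, 'db01');
INSERT INTO datacenters VALUES (1, 'lga'), (2, 'phx'), (3, 'sjc');
INSERT INTO groupa_2011_05_07 VALUES
    (1, 1, 1, '2011-05-07 12:01:00', 45.0),
    (1, 1, 1, '2011-05-07 12:02:00', 52.0),
    (1, 2, 2, '2011-05-07 12:01:00', 80.0),
    (1, 3, 3, '2011-05-07 12:01:00', 99.0);
""")

rows = conn.execute("""
    SELECT max(ts.value), m.metric, h.host, dc.datacenter
    FROM groupa_2011_05_07 ts
    JOIN datacenters dc ON ts.datacenterid = dc.datacenterid
    JOIN metrics m ON ts.metricid = m.metricid
    JOIN hosts h ON ts.hostid = h.hostid
    WHERE m.metric = 'sys.cpu.busy.pct'
      AND h.host LIKE 'web%'
      AND dc.datacenter IN ('lga', 'phx')
      AND ts.timestamp BETWEEN '2011-05-07' AND '2011-05-07 23:59:59.999'
    GROUP BY m.metric, h.host, dc.datacenter
    ORDER BY h.host
""").fetchall()
print(rows)
```

Each new dimension needs another lookup table, another id column, and another join, which is why this layout struggles with unbounded tags.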

Choose Your Own Adventure!
Co-workers: Great work on the schema! Those queries are so much faster. Now we need more dimensions like X, Y, Z, Z, etc. Can we also store JSON events, Git commits, strings, histograms and get some alerting?
You: (sigh) Your wish is my command.

Third Schema:
Cons:
- Doesn't allow for unbounded dimensions (tags).
- Requires complex shard routing code.
- Different columns or tables per data type or stored procedures to encode/decode blobs.

Explore Dedicated Time Series Systems!

Problems to Solve:
- Handle unbounded metrics and dimensions.
- Handle high cardinality dimensions. E.g. userid=? where unique(userid) >= 1M
- Query wide time ranges at lower resolution. E.g. use time rollups for 1 year queries.
- Aggregate multiple time series into single views. E.g. sum(sys.if.traffic_in) where datacenter = phx.
- Perform transformations and extract useful analytics. E.g. Top 10 highest traffic hosts. 99th percentile query latency.
- Replication, High Availability, Write and Read throughput.
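Two of the items above, time rollups and aggregating several series into one view, can be sketched in a few lines. The function names `rollup` and `sum_series` and the sample values are hypothetical; real systems do this incrementally and at much larger scale.

```python
from collections import defaultdict
from statistics import mean


def rollup(points, interval, agg=mean):
    """Downsample (timestamp, value) pairs into fixed-width time buckets."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        buckets[timestamp - timestamp % interval].append(value)
    return {start: agg(values) for start, values in sorted(buckets.items())}


def sum_series(*series):
    """Aggregate several rolled-up series into one by summing per bucket."""
    total = defaultdict(float)
    for one_series in series:
        for bucket, value in one_series.items():
            total[bucket] += value
    return dict(sorted(total.items()))


# Raw traffic samples for two hosts in the same datacenter (made-up values).
web01 = [(0, 10), (60, 20), (3600, 30)]
web02 = [(30, 1), (3700, 2)]

hourly_web01 = rollup(web01, 3600)   # one averaged point per hour
hourly_web02 = rollup(web02, 3600)
datacenter_total = sum_series(hourly_web01, hourly_web02)
print(datacenter_total)
```

Rolling up before aggregating keeps the buckets aligned, so series sampled at different moments can still be summed.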

1990s - MRTG and RRDTool
Schema: Circular buffer, fixed time interval and numeric data.
Pros:
- Fixed file sizes with lower resolution storage.
- Built in graphing and simple methods.
- Portable, backup-able.
Cons:
- Many series == many files == IO thrashing.
- No replication/HA.
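The circular-buffer idea can be illustrated with a deliberately simplified sketch. The `RoundRobinArchive` class is hypothetical and is not RRDTool's on-disk format; it only shows why storage stays fixed while old data is overwritten.

```python
class RoundRobinArchive:
    """Fixed number of slots at a fixed time step; old data is overwritten."""

    def __init__(self, slots: int, step: int):
        self.slots = slots
        self.step = step
        self.ring = [None] * slots  # each slot holds (bucket_start, value)

    def update(self, timestamp: int, value: float) -> None:
        """Store the value in the slot for its time bucket, replacing older data."""
        bucket = timestamp - timestamp % self.step
        self.ring[(bucket // self.step) % self.slots] = (bucket, value)

    def fetch(self):
        """Return the retained points in time order."""
        return sorted(entry for entry in self.ring if entry is not None)


archive = RoundRobinArchive(slots=3, step=60)
for timestamp, value in [(0, 1.0), (60, 2.0), (120, 3.0), (180, 4.0)]:
    archive.update(timestamp, value)

retained = archive.fetch()
print(retained)  # the point at t=0 has been overwritten by t=180
```

Storage never grows past `slots` entries, which is the fixed-file-size property listed under Pros.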

1990s - KDB+, Informix
Schema: ? Proprietary.
Pros:
- Designed for time series.
- Complex analysis.
- Commercial support.
Cons:
- Commercial fees.
- Little integration with open source.

2000s - Graphite
Schema: Circular buffer, fixed time interval and numeric data.
Pros:
- Aggregations and rollups available.
- Transform functions and dashboarding.
- Working on distributed stores.
Cons:
- Lack of replication/HA.
- Same as RRDTool.

2010 - OpenTSDB
Open Source Time Series Database based on Google's in-house time series DB.
- Store trillions of data points at millions of writes per second.
- Keeps raw data at the original timestamp and precise value.
- Keep it forever or TTL it out.
- Scales using HBase or Bigtable.
- Provides multi-series analysis.

What are HBase and Bigtable?
HBase is an OSS distributed LSM backed hash table based on Google's Bigtable.
- Key value, row based column store.
- Sorted by row, columns and cell versions.
- Supports:
  o Scans across rows with filters.
  o Get specific row and/or columns.
  o Atomic operations.
- CP from CAP theorem.
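To show why sorted row keys matter, here is an in-memory stand-in for a sorted key-value store with row scans. The `SortedKV` class and the `cpu|0100` style keys are made up for the example; this is not the HBase client API and it ignores LSM storage, cell versions, and distribution.

```python
import bisect


class SortedKV:
    """Rows kept in byte order by key, each row holding column -> value."""

    def __init__(self):
        self._keys = []   # sorted list of row keys
        self._rows = {}   # row key -> {column: value}

    def put(self, row_key: bytes, column: bytes, value: bytes) -> None:
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key: bytes):
        """Fetch one specific row, or None if it does not exist."""
        return self._rows.get(row_key)

    def scan(self, start: bytes, stop: bytes):
        """Yield rows with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


store = SortedKV()
store.put(b"mem|0100", b"v", b"7")
store.put(b"cpu|0200", b"v", b"52")
store.put(b"cpu|0100", b"v", b"45")

cpu_rows = [key for key, _ in store.scan(b"cpu|", b"cpu|\xff")]
print(cpu_rows)
```

Because keys that share a prefix sit next to each other, a time-range query for one metric becomes a single contiguous scan.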

OpenTSDB Schema
Row key is a concatenation of UIDs and time:
  o salt + metric + timestamp + tagk1 + tagv1 + ... + tagkn + tagvn
  sys.cpu.user 1234567890 42 host=web01 cpu=0
  \x01\x00\x00\x01\x49\x95\xfb\x70\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02
- Timestamp normalized on hour or daily boundaries.
- All data points for an hour or day are stored in one row.
- Data: VLE 64 bit signed integers or single/double precision signed floats, Strings and raw histograms.
- Saves storage space but requires UID conversion.
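The example row key above can be reproduced byte for byte with a short sketch. The UID assignments (metric sys.cpu.user = 1, host = 1, web01 = 1, cpu = 2, "0" = 2), the 3-byte UID width, and the salt value 1 are read off the example bytes rather than taken from a real UID table, and `row_key` is a hypothetical helper, not OpenTSDB's own code.

```python
import struct

UID_WIDTH = 3  # bytes per UID, matching the example key


def uid(n: int) -> bytes:
    """Encode an integer UID as a fixed-width big-endian byte string."""
    return n.to_bytes(UID_WIDTH, "big")


def row_key(salt, metric_uid, timestamp, tag_uids, span=3600):
    """salt + metric + normalized timestamp + sorted (tagk, tagv) UID pairs."""
    base_time = timestamp - timestamp % span  # snap to the hour boundary
    key = bytes([salt]) + uid(metric_uid) + struct.pack(">I", base_time)
    for tagk, tagv in sorted(tag_uids):
        key += uid(tagk) + uid(tagv)
    return key


# sys.cpu.user 1234567890 42 host=web01 cpu=0, with the assumed UIDs.
key = row_key(salt=1, metric_uid=1, timestamp=1234567890, tag_uids=[(1, 1), (2, 2)])
offset = 1234567890 % 3600  # seconds into the hour; identifies the column

print(key.hex())
print(offset)
```

The timestamp bytes `49 95 fb 70` decode to 1234566000, the start of the hour containing 1234567890, so every point in that hour for this series lands in the same row.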

OpenTSDB Schema
Row Key                        Columns (qualifier/value)
m t1 tagk1 tagv1               o1/v1  o2/v2  o3/v3
m t1 tagk1 tagv2               o1/v1  o2/v2
m t1 tagk1 tagv1 tagk2 tagv3   o1/v1  o2/v2  o3/v3
m t1 tagk1 tagv2 tagk2 tagv4   o1/v1         o3/v3
m t1 tagk3 tagv5               o1/v1  o2/v2  o3/v3
m t1 tagk3 tagv6                      o2/v2
m t2 tagk1 tagv1               o1/v1         o3/v3
m t2 tagk1 tagv2               o1/v1  o2/v2
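The layout above, one row per series per hour with one column per offset, can be sketched as a grouping step. The `to_rows` helper and the tuple-based row key are illustrative assumptions; the real column qualifier also packs type and length flags alongside the offset.

```python
from collections import defaultdict


def to_rows(points, span=3600):
    """Group points into rows keyed by (metric, base time, tags); columns are offsets."""
    rows = defaultdict(dict)
    for metric, tags, timestamp, value in points:
        base = timestamp - timestamp % span
        row_key = (metric, base, tuple(sorted(tags.items())))
        rows[row_key][timestamp - base] = value  # offset (o) -> value (v)
    return dict(rows)


t1 = 1234566000  # an hour boundary
points = [
    ("m", {"tagk1": "tagv1"}, t1 + 10, 1.0),
    ("m", {"tagk1": "tagv1"}, t1 + 20, 2.0),
    ("m", {"tagk1": "tagv2"}, t1 + 10, 3.0),
    ("m", {"tagk1": "tagv1"}, t1 + 3600 + 10, 4.0),  # next hour, so a new row
]

rows = to_rows(points)
for row_key, columns in sorted(rows.items()):
    print(row_key, columns)
```

Rows are sparse: a series only gets a column for the offsets at which it actually reported.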

OpenTSDB Use Cases
Backing store for Argus: Open source monitoring and alerting system.
- 50M writes per minute.
- ~4M writes per TSD per minute.
- 23k queries per minute.
https://github.com/salesforce/argus

OpenTSDB Use Cases
Monitoring system, network and application performance and statistics.
- Single cluster: 10M to 18M writes/s ~ 3PB.
- Multi-tenant and Kerberos secure HBase.
- ~200k writes per second per TSD.
- Central monitoring for all Yahoo properties.
- Over 1 billion active time series served.
- Leading committer to OpenTSDB.

Other Users

OpenTSDB
Pros:
- Scalable with HBase/HDFS or hosted Google Bigtable, including replication.
- Annotation and distributed histograms (digests).
- Rollup, pre-aggregate support.
- Built-in graphing and analytics or use OSS tools (Grafana).
Cons:
- Distributed HBase is complex. (Hosted Bigtable is easy.)
- UID resolution and current lack of metadata.

OpenTSDB
For version 3.0:
- New query engine with:
  o Distributed queries.
  o Time based caching.
- Write-through caching using Facebook Beringei.
- Pluggable storage engines.
- Anomaly detection via machine learning.

2010s - Druid
Schema: Time-sharded columnar segments with bitmapped indexes to dictionary strings. In memory and on-disk stores with distributed queries.
Pros:
- Scalable with HDFS or S3, including replication.
- Analytics and mutations with OLAP slicing and dicing.
- Time-based rollups and pre-aggregates.
Cons:
- Complex infrastructure.
- Similar cardinality issue as TSDB.
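Dictionary-encoded strings with bitmap indexes can be sketched as follows. This is a toy illustration of the general technique, not Druid's segment format: `build_index` and `matching_rows` are hypothetical helpers, and plain Python integers stand in for compressed bitmaps.

```python
def build_index(column):
    """Dictionary-encode a string column and build one bitmap per distinct value."""
    dictionary = {}   # string -> small integer code
    encoded = []      # column rewritten as codes
    bitmaps = {}      # code -> int whose bit i is set if row i holds that value
    for row, value in enumerate(column):
        code = dictionary.setdefault(value, len(dictionary))
        encoded.append(code)
        bitmaps[code] = bitmaps.get(code, 0) | (1 << row)
    return dictionary, encoded, bitmaps


def matching_rows(bitmap):
    """Expand a bitmap back into the list of row numbers it selects."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]


datacenter = ["lga", "phx", "lga", "sjc"]
host = ["web01", "web01", "web02", "web01"]

dc_dict, dc_codes, dc_bitmaps = build_index(datacenter)
host_dict, host_codes, host_bitmaps = build_index(host)

# datacenter = 'lga' AND host = 'web01' is a single bitwise AND.
both = dc_bitmaps[dc_dict["lga"]] & host_bitmaps[host_dict["web01"]]
# datacenter IN ('lga', 'phx') is a bitwise OR.
either = dc_bitmaps[dc_dict["lga"]] | dc_bitmaps[dc_dict["phx"]]

print(matching_rows(both), matching_rows(either))
```

Each distinct value needs its own bitmap, which hints at why very high cardinality dimensions are expensive in this design.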

2010s - InfluxDB
Schema: Custom Time Structured Log Structured Merge engine.
Pros:
- Flexible SQL-ish query language.
- Time-based rollups available.
- Nanosecond precision.
Cons:
- Embryonic clustering support (no longer open sourced).
- Similar cardinality issues as other stores.
- Still working on scaling.

Today - Many more
DalmatinerDB
https://misfra.me/2016/04/09/tsdb-list/

Back to RDBMS?
Still possible:
- Separate meta data (names, dimensions) from values.
- Shard across servers using abstraction layers, coordinators.
- Custom SQL plugins.
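A rough sketch of the first two points: series metadata is registered once and mapped to a compact id, and a coordinator routes each id to a shard by hashing. `SeriesRegistry`, `shard_for`, and the shard names are hypothetical; a production coordinator would also need to handle resharding and replication.

```python
import hashlib


class SeriesRegistry:
    """Metadata store: maps (metric, tags) to a compact integer series id."""

    def __init__(self):
        self._ids = {}

    def series_id(self, metric: str, tags: dict) -> int:
        key = (metric, tuple(sorted(tags.items())))
        # Reuse the existing id, or assign the next one for a new series.
        return self._ids.setdefault(key, len(self._ids) + 1)


def shard_for(series_id: int, shards):
    """Pick a shard deterministically from a stable hash of the series id."""
    digest = hashlib.sha256(str(series_id).encode()).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


registry = SeriesRegistry()
shards = ["db-a", "db-b", "db-c"]  # placeholder server names

cpu_id = registry.series_id("sys.cpu.user", {"host": "web01", "cpu": "0"})
mem_id = registry.series_id("sys.mem.free", {"host": "web01"})
cpu_again = registry.series_id("sys.cpu.user", {"cpu": "0", "host": "web01"})

target = shard_for(cpu_id, shards)
print(cpu_id, mem_id, target)
```

The value tables on each shard then store only the series id, a timestamp, and a value, while names and dimensions live in the metadata store.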

More Info and Credits
Thanks to the Monitoring and HBase teams at Yahoo, Pythian for Bigtable support and our OSS contributors!
Contribute at github.com/opentsdb/opentsdb
Website: opentsdb.net
Mailing List: groups.google.com/group/opentsdb
Images:
https://commons.wikimedia.org/wiki/file:programmer_writing_code_with_unit_tests.jpg
http://www.doncio.navy.mil/chips/articledetails.aspx?id=8098
https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&ved=0ahukewia9ruw9l3TAhWr0FQKHTacC4kQjRwIBw&url=https%3A%2F%2Fpixabay.com%2Fen%2Fphotos%2Fthumbs%2520up%2F&psig=afqjcngw50t6xhh7no6swxmd57qyzig6cg&ust=1493151014332670
https://commons.wikimedia.org/wiki/file:twemoji_1f626.svg
https://xkcd.com/1425/
https://commons.wikimedia.org/wiki/emoji#/media/file:twemoji_1f623.svg
https://c1.staticflickr.com/1/508/32307332875_40e73bf750_b.jpg