Using Prometheus with InfluxDB for metrics storage

Similar documents
@InfluxDB. David Norton 1 / 69

The Art of Container Monitoring. Derek Chen

Hynek Schlawack. Get Instrumented. How Prometheus Can Unify Your Metrics

Monitoring MySQL Performance with Percona Monitoring and Management

Prometheus as a (internal) service. Paul Traylor LINE Fukuoka

Open Source Database Performance Optimization and Monitoring with PMM. Fernando Laudares, Vinicius Grippa, Michael Coburn Percona

Search Engines and Time Series Databases

Search and Time Series Databases

Monitoring MySQL with Prometheus & Grafana

Prometheus For Big & Little People Simon Lyall

Monitoring MySQL Performance with Percona Monitoring and Management

Road to Auto Scaling

Operating Within Normal Parameters: Monitoring Kubernetes

Chronix A fast and efficient time series storage based on Apache Solr. Caution: Contains technical content.

Rethinking monitoring with Prometheus

Real-time monitoring Slurm jobs with InfluxDB September Carlos Fenoy García

Staleness and Isolation in Prometheus 2.0. Brian Brazil Founder

Using PostgreSQL, Prometheus & Grafana for Storing, Analyzing and Visualizing Metrics

Time Series Live 2017

Open-Falcon A Distributed and High-Performance Monitoring System. Yao-Wei Ou & Lai Wei 2017/05/22

Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in Operational Data

PostgreSQL monitoring with pgwatch2. Kaarel Moppel / PostgresConf US 2018

Relabeling Julien Pivotto PromConf Munich August 9, 2017

Performance Monitoring and Management of Microservices on Docker Ecosystem

Visualize Your Data With Grafana Percona Live Daniel Lee - Software Engineer at Grafana Labs

Prometheus. A Next Generation Monitoring System. Brian Brazil Founder

Inside the InfluxDB Storage Engine

Kubernetes. Introduction

Monitoring Docker Containers with Splunk

Quo vadis, Prometheus?

README file for TICKpy (CogSys) Container v0.9.4

Monitoring for IT Services and WLCG. Alberto AIMAR CERN-IT for the MONIT Team

Evolution of the Prometheus TSDB. Brian Brazil Founder

Monitoring MySQL with Prometheus Ben Kochie - Prometheus Lead - GitLab

Cloud & container monitoring , Lars Michelsen Check_MK Conference #4

A practical guide to monitoring and alerting with time series at scale

Pursuit of stability. Growing AWS ECS in production. Alexander Köhler Frankfurt, September 2018

Evolving Prometheus for the Cloud Native World. Brian Brazil Founder

OnCommand Cloud Manager 3.2 Deploying and Managing ONTAP Cloud Systems

Federated Prometheus Monitoring at Scale

Monitoring Cloud Native applications with Prometheus. Aaron Weaveworks

Zero to Microservices in 5 minutes using Docker Containers. Mathew Lodge Weaveworks

Administering Microsoft SQL Server 2012 Databases

OnCommand Unified Manager

Administering Microsoft SQL Server 2012 Databases

COURSE 20462C: ADMINISTERING MICROSOFT SQL SERVER DATABASES

The InfluxDB-Grafana plugin for Fuel Documentation

Professional PostgreSQL monitoring made easy. Kaarel Moppel - p2d2.cz 2019 Prague

Basic knowledge of the Microsoft Windows operating system and its core functionality.

OSM Hackfest Session 6 Performance & Fault Management Benjamín Díaz (Whitestack)

Graph and Timeseries Databases

Course Content MongoDB

MS-20462: Administering Microsoft SQL Server Databases

Monasca. Monitoring/Logging-as-a-Service (at-scale)

Package your Java Application using Docker and Kubernetes. Arun

Monitoring Testbed Experiments with MonEx

Persistent Storage with Docker in production - Which solution and why?

Network Automation using modern tech. Egor Krivosheev 2degrees

Datacenter replication solution with quasardb

Effecient monitoring with Open source tools. Osman Ungur, github.com/o

ShardProxy Replication-Manager. DataOps - Juin 2018 Kentoku SHIBA - Stephane VAROQUI

The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Dublin Apache Kafka Meetup, 30 August 2017.

Data pipelines with PostgreSQL & Kafka

Using the Cisco Unified Analysis Manager Tools

MMS Backup Manual Release 1.4

Zadara Enterprise Storage in

Maintaining a Microsoft SQL Server 2008 Database (Course 6231A)

Administering a SQL Database Infrastructure (M20764)

Microservices. Chaos Kontrolle mit Kubernetes. Robert Kubis - Developer Advocate,

GoDocker. A batch scheduling system with Docker containers

CrateDB for Time Series. How CrateDB compares to specialized time series data stores

VMWARE VREALIZE OPERATIONS MANAGEMENT PACK FOR. MongoDB. User Guide

Survey and Comparison of Open Source Time Series Databases

observability and product release: leveraging prometheus to build and test new products digitalocean.com

Setting up Kubernetes with Day 2 in Mind. Angela Chin, Senior Software Engineer, Pivotal Urvashi Reddy, Senior Software Engineer, Pivotal

Cloud Monitoring as a Service. Built On Machine Learning

Course 6231A: Maintaining a Microsoft SQL Server 2008 Database

@joerg_schad Nightmares of a Container Orchestration System

Using the SDACK Architecture to Build a Big Data Product. Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver

Voltha Architecture in a clustered HA configuration. Sergio Slobodrian, Ciena CORD Build Wed, November 7 th, 2017

The Internet of Things:

PCP: Ingest and Export


Maintaining a Microsoft SQL Server 2008 R2 Database

Regain control thanks to Prometheus. Guillaume Lefevre, DevOps Engineer, OCTO Technology Etienne Coutaud, DevOps Engineer, OCTO Technology

Course AZ-100T01-A: Manage Subscriptions and Resources

Course 6231A: Maintaining a Microsoft SQL Server 2008 Database

6231B - Version: 1. Maintaining a Microsoft SQL Server 2008 R2 Database

MarkLogic Server. Monitoring MarkLogic Guide. MarkLogic 8 February, Copyright 2015 MarkLogic Corporation. All rights reserved.

Thales PunchPlatform Agenda

Kubernetes 1.9 Features and Future

Deploying and Using ArcGIS Enterprise in the Cloud. Bill Major

PUBLIC SAP Vora Sizing Guide

Developing Microsoft Azure Solutions (70-532) Syllabus

MySQL Database Administrator Training NIIT, Gurgaon India 31 August-10 September 2015

ArcGIS Enterprise: Advanced Topics in Administration. Thomas Edghill & Moginraj Mohandas

ESC in Maintenance Mode

Ingest. David Pilato, Developer Evangelist Paris, 31 Janvier 2017

ExtraHop 6.0 ExtraHop REST API Guide

VMWARE VREALIZE OPERATIONS MANAGEMENT PACK FOR. Nagios. User Guide

Transcription:

Using Prometheus with InfluxDB for metrics storage Roman Vynar Senior Site Reliability Engineer, Quiq September 26, 2017

About Quiq Quiq is a messaging platform for customer service. https://goquiq.com We monitor all our infrastructure with 1 Prometheus: 190 targets, 190K time-series, 10K samples/sec ingestion rate. We store customer-related and developer metrics of all the micro-services in InfluxDB using in-house InfluxDB HA implementation. 2

3 Time-series databases

Prometheus Prometheus is 100% open-source and community-driven Modern and efficient Multi-dimensional data model Collection via pull model Powerful query language and HTTP API Service discovery Alerting toolkit and integrations Federation of Prometheis 4

5 Prometheus architecture

InfluxDB Open-source and commercial offering Modern and efficient Multi-dimensional data model Collection via push model SQL and HTTP API A component of a full-stack platform Backup and restore Clustering (proprietary, commercial) 6

7 InfluxDB architecture

Time-series structure Prometheus: metric{job=" ", instance=" ", label1=" ", label2=" "} float64 timestamp (ms) gauge counter histogram summary InfluxDB: db.retention.measurement tag1=" ",tag2=".." field1=bool,field2="string",field3=int float64 timestamp (ns) 8

Prometheus 1.7.1 vs InfluxDB 1.3.5 Feature Prometheus InfluxDB Metrics collection model Pull Push Storage Ephemeral Long-lived Data retention A single, global Multiple, per database Service discovery Built-in N/A Clustering Federation Commercial Downsampling Recording rules Continuous queries Query language PromQL InfluxSQL Backup and restore Another prom instance Binary and raw formats Integrations Components, 3rd-party TICK stack, 3rd-party 9

Prometheus and pull Prometheus scrapes metrics from remote exporters Configurable frequency of scraping Relabeling Simple protocol-buffer or text-based exposition format Custom on-demand metrics via textfile collector of node_exporter "Push" is also possible via pushgateway 10

Prometheus storage and retention A sophisticated local storage subsystem Chunks of constant size for the bulk sample data LevelDB for indexes Circular global retention Not really designed for long-term storage 11

Prometheus service discovery Service discovery out the box DNS Consul AWS GCP Azure Kubernetes Openstack Dynamic and flexible configuration 12

Prometheus federation Federation allows a Prometheus server to scrape selected time series from another Prometheus server. Hierarchical federation Cross-service federation 13

Prometheus recording rules Recording rules allow you to precompute frequently needed or expensive expressions and save their result as a new set of time series Can be used for downsampling 14

PromQL Prometheus provides a functional expression language that lets the user select and aggregate time series data in real time. Cross-metric queries Grouping and joins Functions over functions 15

Prometheus and backups No backup mechanism However, you can run multiple Prometheus instances to do exactly the same job to keep a standby copy. 16

Prometheus integrations Grafana Alertmanager Dropwizard, Gitlab, Docker, etc. InfluxDB: read, write OpenTSDB: write Chronix: write Graphite: write PostgreSQL/TimescaleDB: read, write 17

Prometheus vs InfluxDB Feature Prometheus InfluxDB Metrics collection model Pull Push Storage Ephemeral Long-lived Data retention A single, global Multiple, per database Service discovery Built-in N/A Clustering Federation Commercial Downsampling Recording rules Continuous queries Query language PromQL InfluxSQL Backup and restore Another prom instance Binary and raw formats Integrations Components, 3rd-party TICK stack, 3rd-party 18

InfluxDB and push Telegraph pushes samples to InfluxDB There are 100+ plugins for Telegraphs "Push" on demand 19

InfluxDB storage and retention Compressed and encoded data are organized in shards with duration Shards are grouped into shard groups by time and duration Multiple databases Multiple retentions per database Each database has its own set of WAL and TSM files 20

InfluxDB downsampling Configurable retentions per database Continuous queries across retentions and databases Flexible time grouping, resampling intervals and offsets Commercial clustering ensures the data is copied to X replicas 21

InfluxQL SQL-like language Schema exploration Flexible grouping by time intervals No joins No functions over functions 22

InfluxDB backup and restore Built-in backup/restore tool Backup/restore a specific database/retention/shard Backup since a specific date Separate backup of datastore and metastore HTTP API allows for a plain-text backup/restore too 23

InfluxDB integrations Kapacitor Chronograf Grafana Remote read/write by Prometheus 24

Prometheus vs InfluxDB Feature Prometheus InfluxDB Metrics collection model Pull Push Storage Ephemeral Long-lived Data retention A single, global Multiple, per database Service discovery Built-in N/A Clustering Federation Commercial Downsampling Recording rules Continuous queries Query language PromQL InfluxSQL Backup and restore Another prom instance Binary and raw formats Integrations Components, 3rd-party TICK stack, 3rd-party 25

Prometheus + InfluxDB Feature Prometheus InfluxDB Metrics collection model Pull Push Storage Ephemeral Long-lived Data retention A single, global Multiple, per database Service discovery Built-in N/A Clustering Federation Commercial Downsampling Recording rules Continuous queries Query language PromQL SQL Backup and restore Another prom instance Binary and raw formats Integrations Components, 3rd-party TICK stack, 3rd-party 26

What is better? InfluxDB: For event logging. Commercial option offers clustering for InfluxDB, which is also better for long term data storage. Eventually consistent view of data between replicas. Prometheus: Primarily for metrics. More powerful query language, alerting, and notification functionality. Higher availability and uptime for graphing and alerting. 27

Prometheus and InfluxDB integration Currently, there are 2 options: 1. Using remote_storage_adapter: https://github.com/prometheus/prometheus/tree/master/ documentation/examples/remote_storage/remote_storage_adapter 2. Writing to InfluxDB directly (nightly builds of not yet released v1.4): https://www.influxdata.com/blog/influxdb-now-supports-prometheusremote-read-write-natively/ (posted on Sep 14, 2017) 28

Prometheus and InfluxDB integration Prometheus Adapter InfluxDB 29

docker-compose.yml $ cat PL17-Dublin/docker-compose.yml version: '2' services: prom: image: prom/prometheus:v1.7.1 command: -storage.local.path="/promdata" ports: - "9090:9090" volumes: -./prometheus.yml:/prometheus/prometheus.yml:ro -./promdata:/promdata 30 influxdb: image: influxdb:1.3.5 command: -config /etc/influxdb/influxdb.conf ports: - "8086:8086" volumes: -./influxdata:/var/lib/influxdb

Running InfluxDB docker-compose up -d influxdb docker exec -ti pl17dublin_influxdb_1 influx > CREATE USER "admin" WITH PASSWORD 'admin' WITH ALL PRIVILEGES; docker exec -ti pl17dublin_influxdb_1 bash > influx >> auth >> CREATE DATABASE prometheus; >> CREATE USER "prom" with password 'prom'; >> GRANT ALL ON prometheus TO prom; >> ALTER RETENTION POLICY "autogen" ON "prometheus" DURATION 1d REPLICATION 1 SHARD DURATION 1d DEFAULT; >> SHOW RETENTION POLICIES ON prometheus; 31

Running remote_storage_adapter go get github.com/prometheus/prometheus/documentation/examples/ remote_storage/remote_storage_adapter INFLUXDB_PW=prom $GOPATH/bin/remote_storage_adapter -influxdb-url=http://localhost:8086 -influxdb.username=prom -influxdb.database=prometheus -influxdb.retention-policy=autogen 32

Prometheus config file global: scrape_interval: 1s scrape_timeout: 1s scrape_configs: - job_name: prometheus static_configs: - targets: ['localhost:9090'] labels: instance: prom remote_write: - url: http://docker.for.mac.localhost:9201/write 33

Running Prometheus and verification docker-compose up -d prom docker logs pl17dublin_prom_1 docker logs -f --tail 10 pl17dublin_influxdb_1 docker exec -ti pl17dublin_influxdb_1 bash > influx >> auth >> USE prometheus; >> SHOW MEASUREMENTS; 34

Downsampling with InfluxDB CREATE DATABASE trending; CREATE RETENTION POLICY "1m" ON trending DURATION 0s REPLICATION 1 SHARD DURATION 1w DEFAULT; CREATE RETENTION POLICY "5m" ON trending DURATION 0s REPLICATION 1 SHARD DURATION 1w DEFAULT; SHOW RETENTION POLICIES ON trending; USE prometheus; CREATE CONTINUOUS QUERY scrape_samples_scraped_1m ON prometheus BEGIN SELECT LAST(value) as "value" INTO trending."1m".scrape_samples_scraped FROM scrape_samples_scraped GROUP BY time(1m) END; CREATE CONTINUOUS QUERY scrape_samples_scraped_5m ON prometheus BEGIN SELECT LAST(value) as "value" INTO trending."5m".scrape_samples_scraped FROM scrape_samples_scraped GROUP BY time(5m) END; SHOW CONTINUOUS QUERIES; USE trending; SHOW MEASUREMENTS; SHOW SHARDS; SELECT * FROM trending."1m".scrape_samples_scraped; 35

Prometheus remote read (proxy to InfluxDB) $ cat PL17-Dublin/docker-compose.yml version: '2' services: promread: image: prom/prometheus:v1.7.1 command: -storage.local.engine=none ports: - "9091:9090" volumes: -./promread.yml:/prometheus/prometheus.yml:ro 36

Prometheus remote read Prometheus configuration: remote_read: - url: http://docker.for.mac.localhost:9201/read Start Prometheus with the above config: docker-compose up -d prom read 37

Questions? Thank you! 38