Kibana, Grafana and Zeppelin on Monitoring data

Kibana, Grafana and Zeppelin on Monitoring data
Internal group presentation
Ildar Nurgaliev, OpenLab Summer student

Presentation structure
- About IT-CM-MM Section and myself
- Visualisation with Kibana 4 and Grafana: motivation and comparison
- Reporting and plotting with Zeppelin: motivation and comparison
- Improvement by using Spark
11/08/2016 Ildar Nurgaliev CERN openlab

Ildar Nurgaliev

Education (start to end: university, degree, faculty)
- 2011 to 2015: KFU - High School ISIT, Bachelor, Applied Informatics
- 2014 to 2015: Innopolis university, Bachelor, Artificial Intelligence
- 2015 to now: Innopolis university, Master, Data Science

Experience
- 2013 to 2014: Solution Developer at Fujitsu GDC (Global Delivery Center): enterprise application, desktop application, thin web client (Ext-JS, Java, Spring-framework, Hibernate, Swing)
- 2014 to 2016: Researcher: Cognitive Architecture NEUCOGAR; graph matching and percolation theory for huge graphs (C++, math algorithms, random graph models)

Since 2014

IT-CM-MM Section (Monitoring)
- Evolution of monitoring tools for the CERN Tier-0 & WLCG
- Monitoring of DC at CERN and Wigner: meter, timber
- Experiment Dashboards: data transfers, job information, WLCG reports

Visualisation Comparison
- Upcoming upgrade of DC Monitoring (meter, timber) from Kibana 3
- Data Centre Overview dashboard: Kibana 4, Grafana
- Investigation on Grafana: porting Host Metrics, porting FTS dashboard, ACL and SSO
- Comparison of Kibana vs. Grafana (table)

Data Centre Overview: Kibana 3

Data Centre Overview: Kibana 4

Data Centre Overview: Grafana

HOST Metrics: Kibana 3

HOST Metrics: Kibana 3, Mem Util. (very similar plots)

HOST Metrics: Grafana, templated repeat visualisation

HOST Metrics: Grafana, automatically generated visualisation: Mem Util
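The templated repeat shown on the Grafana slides can be pictured as one panel definition expanded once per selected value of a template variable. The sketch below is illustrative only: the dictionary layout, the `$host` placeholder, the query text and the `expand_repeat` helper are assumptions made for this example, not Grafana's dashboard schema or implementation.

```python
from copy import deepcopy


def expand_repeat(panel_template, variable, values):
    """Return one panel per value, substituting $<variable> in every string field."""
    placeholder = "$" + variable
    panels = []
    for value in values:
        panel = deepcopy(panel_template)
        for key, field in panel.items():
            if isinstance(field, str):
                panel[key] = field.replace(placeholder, value)
        panels.append(panel)
    return panels


# Hypothetical panel definition: field names and query syntax are made up.
template = {
    "title": "Mem Util: $host",
    "query": "metric:mem_util AND host:$host",
    "type": "graph",
}

# Hypothetical host names standing in for the values of the template variable.
panels = expand_repeat(template, "host", ["node01", "node02", "node03"])
for p in panels:
    print(p["title"], "|", p["query"])
```

The point of the design is that the dashboard author maintains a single definition, while the number of rendered plots follows the selected values.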

FTS Monitoring: Kibana 4

FTS Monitoring: Grafana, select VO = ATLAS;CMS

FTS Monitoring: Grafana, some endpoints selected

FTS Monitoring: Grafana, automated menu (Grafana feature)

FTS Monitoring: Grafana, Transfer Sites Dashboard

FTS Monitoring: Kibana 4, Ranking Country Dashboard (nominal axis)

FTS Monitoring: Grafana, Ranking Country Dashboard (tables, pie, NO bar charts!)

ACL in Grafana
For testing purposes, 3 organisations (groups) were created:
- MONIT: default for newcomers (Viewer permission); contains all the dashboards now
- ATLAS
- CMS
We tested whether users can access, read or write across those organisations.
Summary:
- Users can be assigned to different organisations
- Using this multiple subscription, we could attach LDAP as a centralised ACL
- Grafana has good support for SSO; SSO accounts are mapped automatically
- Nested groups are not supported in Grafana

Comparison (by feature)

Aspect
- Grafana: plots look refined, without banners or edit button; expandable visualisations
- Kibana: not very refined

Search and exploring
- Grafana: Lucene query for visualisation; done by the dashboard developer
- Kibana: basically it has "data exploration"; shows document structure; built-in search highlights; search can be saved as an object; done by the dashboard user

Reuse of objects
- Grafana: no visualisation reuse; search query must be repeated manually
- Kibana: Visualisation, Dashboards and Search are saved as objects (pluggable into many dashboards)

General plots
- Grafana: all plots are time-series based
- Kibana: time series, nominal axis, basic heat map

Export & Share
- Grafana: Visualisation -> CSV, PNG, rendered image URL
- Kibana: Visualisation -> CSV, JSON

Comparison (by feature, continued)

Role-based access (RBA)
- Grafana: RBA/ACL supported by default as Organisations
- Kibana: no built-in RBA/ACL; commercial plug-in

Plotting derived fields
- Grafana: yes, scripted fields
- Kibana: yes, scripted fields; TimeLion as a purposeful tool

Combined plots
- Grafana: yes, very flexible visualisation, overlapping plots
- Kibana: no

Plots from different sources
- Grafana: yes, as many as you wish (ES, Graphite, InfluxDB)
- Kibana: no, one ES source only

Support for templates
- Grafana: yes, automatic visualisations from a limited set of values for a variable; change datasource on the fly
- Kibana: no

Scrutiny Report Plots
Scrutiny Report: overview of usage of all WLCG resources (more than 50 pages)
Motivation: automate generation of plots for the ATLAS report with Zeppelin
- Datasets popularity plots
- Discovery of unused data
- Used to optimise and improve data management policies (e.g. replicas and lifetime of data)

Current Workflow - Input Data
- Cronjobs run PIG (Hadoop) scripts that aggregate ATLAS dataset access events from Hadoop
- Summaries are generated on a web server as CSV files
- CSV files are manually downloaded and imported into Excel

Current Workflow - Plots
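The aggregation step of the current workflow, in which access events are reduced to CSV summaries, can be sketched in a few lines. This is a minimal illustration in plain Python, not the PIG scripts themselves: the event layout (dataset name, month), the column names and the sample values are assumptions.

```python
import csv
import io
from collections import Counter


def summarise_accesses(events):
    """Count access events per (dataset, month) pair."""
    counts = Counter()
    for dataset, month in events:
        counts[(dataset, month)] += 1
    return counts


def to_csv(counts):
    """Render the counts as CSV text with a header row, sorted for stable output."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["dataset", "month", "accesses"])
    for (dataset, month), n in sorted(counts.items()):
        writer.writerow([dataset, month, n])
    return buffer.getvalue()


# Made-up access events: (dataset name, month of access).
events = [
    ("data15.A", "2016-05"),
    ("data15.A", "2016-05"),
    ("data15.A", "2016-06"),
    ("mc15.B", "2016-06"),
]

summary_csv = to_csv(summarise_accesses(events))
print(summary_csv)
```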

Why Zeppelin
Web-based notebook that:
- Enables interactive data analytics with powerful dynamic visualisations
- Supports several technologies out of the box (Python, SQL, Spark, Hadoop, etc.)
- Provides interactive forms (Text Input Forms, Select Forms, Checkbox forms, ...)

Datasets Popularity Plots
- Starting from the same CSV files
- Show the number of times ATLAS data were accessed, to find dataset usage
- Volumes of data (V) vs. number of accesses (X) for the last N months
- Number of times accessed in periods of 3, 6, 9, 12 months and infinity
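The popularity plot described above amounts to summing data volume per number of accesses, once for each time window. The following sketch shows that computation under stated assumptions: the input layout (size in TB plus the age of each access in months), the boundary rule (an access exactly N months old falls outside an N-month window) and the sample values are all made up for illustration.

```python
from collections import defaultdict

WINDOWS = [3, 6, 9, 12, None]  # months; None stands for "infinity" (all time)


def accesses_in_window(access_ages, window):
    """Count accesses whose age in months falls inside the window."""
    if window is None:
        return len(access_ages)
    return sum(1 for age in access_ages if age < window)


def volume_by_access_count(datasets, window):
    """Sum dataset volume (V) per number of accesses (X) within the window."""
    volume = defaultdict(float)
    for size_tb, access_ages in datasets.values():
        volume[accesses_in_window(access_ages, window)] += size_tb
    return dict(volume)


# Made-up datasets: name -> (size in TB, age of each access in months).
datasets = {
    "ds_a": (10.0, [1, 2, 8]),
    "ds_b": (5.0, [14]),
    "ds_c": (2.5, []),
}

for window in WINDOWS:
    label = "inf" if window is None else f"{window}m"
    print(label, volume_by_access_count(datasets, window))
```

The entry for zero accesses corresponds to the "0-access bin": the volume of data that nobody touched in that window.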

Sample Plots - View
- PyPlot for report-quality plots
- 0-access bins
- Old and new datasets
- Most accessed datasets
- Interactive built-in plots for discovery

Zeppelin Notebooks: developer's view and user's view

Discovery of Unused Datasets
- Starting from the same CSV files
- Show Top-N unused data by project/datatype for the last X months
- For every month it is clear which datasets/datatypes are unused (e.g. mc16_7tev)
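A Top-N ranking of unused data by project and datatype could be computed along the following lines. This is a sketch under assumptions, not the notebook's actual code: the record fields, the rule that a dataset is "unused" when it was never accessed or its last access is at least X months old, and the sample records are hypothetical.

```python
from collections import defaultdict


def top_unused(datasets, last_months, n):
    """Return the n (project, datatype) groups with the largest unused volume."""
    unused = defaultdict(float)
    for d in datasets:
        last = d["months_since_last_access"]
        # Unused: never accessed, or last access at least `last_months` ago.
        if last is None or last >= last_months:
            unused[(d["project"], d["datatype"])] += d["size_tb"]
    # Largest volume first; ties broken by group name for stable output.
    ranked = sorted(unused.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Made-up records; None means the dataset was never accessed.
datasets = [
    {"project": "mc16_7tev", "datatype": "AOD", "size_tb": 4.0,
     "months_since_last_access": None},
    {"project": "mc16_7tev", "datatype": "AOD", "size_tb": 1.5,
     "months_since_last_access": 9},
    {"project": "data15_13tev", "datatype": "DAOD", "size_tb": 3.0,
     "months_since_last_access": 7},
    {"project": "data15_13tev", "datatype": "DAOD", "size_tb": 8.0,
     "months_since_last_access": 1},
]

ranking = top_unused(datasets, last_months=6, n=2)
for (project, datatype), size_tb in ranking:
    print(project, datatype, size_tb)
```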

Unused Data by Type and Creation Time

Aggregation with Spark - Ongoing
- Replace PIG scripts with Spark
- Improve execution speed
- Use Mesos/Chronos monitoring infrastructure for scheduling jobs

Thank you CERN!
List of studied technologies:
- Elasticsearch
- Kibana dashboard
- Grafana dashboard
- Hadoop
- Apache PIG
- Apache Spark
- Zeppelin notebook