Model-Driven Telemetry and Analytics

Similar documents
Consuming Model-Driven Telemetry

PSOACI Tetration Overview. Mike Herbert

Technologies for the future of Network Insight and Automation

Cisco WAN Automation Engine (WAE) Network Programmability with Segment Routing

NXOS in the Real World Using NX-API REST

Introduction to OpenConfig

Network Automation using modern tech. Egor Krivosheev 2degrees

NSO in Brownfield: Fully Automated One-Click Reconciliation

Machine Learning with Python

Automation and Programmability using Cisco Open NXOS and DevOps Tools

Demystifying Machine Learning

Tetration Hands-on Lab from Deployment to Operations Support

DevOps CICD for VNF a NetOps Approach

Automation with Meraki Provisioning API

Empower your testing with Cisco Test Automation Solution Featuring pyats & Genie

Hands On Exploration of NETCONF and YANG

NetDevOps Style Configuration Management for the Network

Insights into your WLC with Wireless Streaming Telemetry

Routing Underlay and NFV Automation with DNA Center

Cisco Crosswork Network Automation

DNA Automation Services Offerings

Self-driving Datacenter: Analytics

CloudCenter for Developers

An Introduction to Developing for Cisco Kinetic

Model-Driven Telemetry. Shelly Cadora Principal Engineer, Technical Marketing

Cisco SD-Access Hands-on Lab

Managing The Digital Network Workforce Transformation

Getting Started with OpenStack

Hands-On with IoT Standards & Protocols

Get Hands On With DNA Center APIs for Managing Intent

NetDevOps for the Network Dude How to get started with API's, Ansible and Python

DNA Assurance. Predict Network Failures Before They Become Issues

Magical Chatbots with Cisco Spark and IBM Watson

Internet of Things Field Network Director

Cisco VIRL. The Swiss-Army Knife of Network Simulators. Simon Knight, Software Engineer Brian Daugherty, Technical Leader.

Cisco Tetration Analytics

Cisco UCS Agentless Configuration Management Ansible or Microsoft DSC

The Transformation of Media & Broadcast Video Production to a Professional Media Network

Cisco Container Platform

Cisco SD-Access Building the Routed Underlay

Contiv installation and integration with ACI

PnP Deep Dive Hands-on with APIC-EM and Prime Infrastructure

Cisco IOS XR Programmability for Cloud-Scale Networking

Introducing Cisco Network Assurance Engine

Benefits of SDN Modeling and Analytics tool for complex Service Provider Network

Kuber-what?! Learn about Kubernetes

DEVNET Introduction to Git. Ashley Roach Principal Engineer Evangelist

Migrating Applications with CloudCenter

Hybrid Cloud Automation using Cisco CloudCenter API

Cloud Mobility: Meraki Wireless & EMM

Ipswitch: The New way of Network Monitoring and how to provide managed services to its customers

APIC-EM / EasyQoS - End to End Orchestration of QoS in Enterprise Networks

Zero-Touch Operations - Managing Your Network as Code

Catalyst 9K High Availability Lab

Coding Intro to APIs and REST

DevNet Workshop-Hands-on with CloudCenter and Jenkins

Is your IT Infrastructure Ready for Machine Learning & Artificial Intelligence?

PSOACI Why ACI: An overview and a customer (BBVA) perspective. Technology Officer DC EMEAR Cisco

Cisco DNA Center and Italtel Netwrapper Evolution: Network and Applications come together

Serviceability of SD-WAN

Orange: Cisco & Orange: a human touch for a digital experience

VXLAN EVPN Fabric and automation using Ansible

Cloud-Ready WAN For IAAS & SaaS With Cisco s Next- Gen SD-WAN

TRex Realistic Traffic Generator

Introduction to Cisco IoT Tools for Developers IoT 101

BGP in the Enterprise for Fun and (fake) Profit: A Hands-On Lab

2018 Cisco and/or its affiliates. All rights reserved. Cisco Public

Connected Mobile Experiences (CMX) Aligning Use Cases and Technology

Contiv installation and integration with ACI. LTRCLD-2003

Next Gen Enterprise Management and Operations with Cisco DNA

Docker and Oracle Everything You Wanted To Know

Cisco UCS Director and ACI Advanced Deployment Lab

Resource Allocation Resource Usage Data Access Control. Network Intelligence, Guidance. Statistics, States, Objects and Events.

Customer s journey into the private cloud with Cisco Enterprise Cloud Suite

Git, Atom, virtualenv, oh my! Learn about dev tools to live by!

Your API Toolbelt Tools and techniques for testing, monitoring, and troubleshooting REST API requests

Software Innovations for Cloud Scale Networking. Kelly Ahuja Senior Vice President Service Provider Business, Products & Solutions November 18, 2015

Cisco Network Programmability for the Enterprise NPEN v1.0

Sourcefire Network Security Analytics: Finding the Needle in the Haystack

Enterprise Recording and Live Streaming Architecture with VBrick

Cisco Spark Messaging APIs - Integration Platforms as a Service Real World Use-Cases

Data Driven Networks

A Practical Look at DNA Center: A better way to manage your network in the digital era. Hands-On Lab

Designing and Implementing Cisco Network Programmability (NPDESI) v1.0

Cisco Enterprise Agreement

Cisco SAN Analytics and SAN Telemetry Streaming

Cisco Spark Widgets Technical drill down

Advanced CSR Lab with High Availability and Transit VPC

Diego R. López Telefónica I+D

Model Driven APIs for the Network Infrastructure Layer

Cisco Tetration Analytics

Getting Started With Containers

Spark SDK Video - Overview and Coding Demo

Cisco Virtualized Infrastructure Manager

Več kot SDN - SDA arhitektura v uporabniških omrežjih

Think Small to Scale Big

AlgoSec: How to Secure and Automate Your Heterogeneous Cisco Environment

Simplifying Collaboration Deployments with Prime Collaboration

One Platform Kit: The Power to Innovate

Cisco Spark. Questions? Use Cisco Spark to communicate with the speaker after the session. How

Transcription:

Model-Driven Telemetry and Analytics Steven Barth & Cristina Precup Software Systems Engineers

Cisco Spark How Questions? Use Cisco Spark to communicate with the speaker after the session 1. Find this session in the Cisco Live Mobile App 2. Click Join the Discussion 3. Install Spark or go directly to the space 4. Enter messages/questions in the space cs.co/ciscolivebot# 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public

Scope of this session What we will cover: How Telemetry is consumed How you can implement a closedloop usecase with OS software How Telemetry and Machine Learning can play together What we will not cover: Telemetry and IOS XR basics Telemetry Track @ CL Barcelona DEVNET-1710 on Wednesday General intro, end-to-end overview BRKSPG-2999 on Wednesday Advanced Telemetry: Devices, scale and security you are here! LTR-2578 on Fri. 09.00-13.00 Consuming Telemetry: hands-on guided lab

Agenda Introduction Demo: Link-quality assurance with Model-Driven Telemetry Streaming Telemetry data from a device to a collector Storing and processing Telemetry data in a time series platform Visualizing data and reacting to events Demo: Telemetry-based vbng address pool management Machine Learning & Big Data use cases Conclusion

Traditional Monitoring Concepts No Longer suited for Cloud-Scale Network Operations Where Data Is Created Where Data Is Useful SNMP Sensing & Measurement syslog CLI Storage & Analysis Strong burden on back-end Normalize different encodings, transports, data models, timestamps 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 6

Streaming Telemetry Concepts Better suited for Cloud-Scale Network Operations Where Data Is Created Sensing & Measurement Streaming Telemetry Push paradigm One consistent way to access Statistics, Oper state & Events @ all layers High Performance: 10 sec Multiple encodings & Transport Where Data Is Useful Storage & Analysis Volume: Scale of Data Velocity: Analysis of Streaming Data Variety: Different Forms of Data 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 7

Streaming Telemetry Design Vision Performance Get as much data off the box as quickly as possible Coverage Grant full access to all operational data on the box* Automation Serialize the data in a flexible, efficient way that fits customers automated tools *User needs to have the correct privileges 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 8

Demo: Link-quality assurance with Model-Driven Telemetry

Closed-loop Telemetry setup with open-source analytics tools Cisco NSO IOS XRv 9000 Pipeline InfluxDB & Kapacitor Grafana IOS XRv 9000 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 10

Demo: Link-quality assurance with Model-Driven Telemetry Demo use case Introduce impairment (latency 50ms) XRv-AS65510-3 XRv-AS65510-5 3.3.3.3 5.5.5.5 GigE 0/0/0/1 GigE 0/0/0/0 Monitor quality metrics (CRC errors, dropped packets, latency, ) 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 11

Demo: Link-quality assurance with Model-Driven Telemetry Which operational YANG model? What are the keys / dimensions? Which sensors to measure? 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 12

Streaming Telemetry data from a device to a collector

Router Configuration IOS XR model-driven telemetry & GRPC What to stream? When to stream? Where to listen for collector? telemetry model-driven sensor-group sensor1 sensor-path Cisco-IOS-XR-ip-bfdoper:bfd/session-details/session-detail! subscription mdt-realtime sensor-group-id sensor1 sample-interval 5000!! grpc port 57777! 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 14

Pipeline An open-source Telemetry collector https://github.com/cisco/bigmuddy-network-telemetry-pipeline 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 15

Pipeline Configuration 1. Input Stage [default] id = pipeline What protocol to use? What encoding to use? Where to connect to? [dialin_xrv-as65510-3] stage = xport_input type = grpc encap = gpb encoding = gpbkv server = 172.16.1.65:57777 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 16

Pipeline Configuration 2. Filter Stage Which YANG model? Which values are key(s)? What is measured? [{ "basepath": "Cisco-IOS-XR-ip-bfd-oper: bfd/session-details/session-detail", "spec": {"fields": [ {"name": "interface-name", "tag": true}, {"name": "status-information", "fields": [ {"name": "latency-average"} ]} ]} }] 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 17

Pipeline Configuration 3. Output Stage What output type to use? What filter to use? (previous slide) Where to stream to? [metrics_influx] stage = xport_output type = metrics file = /etc/mdt-realtime/metrics.json datachanneldepth = 1000 output = influx influx = http://influxdb:8086 database = mdt_realtime workers = 10 username = admin 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 18

Storing and processing Telemetry data in a time series platform

Processing Telemetry data in a time series platform What is a time series database? Series of data points (e.g. link latency) with given keys (e.g. device + interface) Run statistical functions (e.g. maximum, moving average, ) Trigger alarms and events 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 20

Processing Telemetry data in a time series platform 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 21

Processing Telemetry data in a time series platform Influx DB Sensor Path Select stream Condition #1 Action #1 Condition #2 Action #2 var latency = stream from().database('mdt_realtime').retentionpolicy('autogen').measurement('cisco-ios-xr-ip-bfd-oper:bfd/session-details/session-detail').where(lambda: "Producer"=='XRv-AS65510-5').where(lambda: "interface-name"=='gigabitethernet0/0/0/0').where(lambda: "status-information latency-average" > 0) latency where(lambda: "status-information latency-average" <= 100000).crit(lambda: "status-information latency-average" <= 100000).exec('/telemetry-action.sh', 'enable') latency where(lambda: "status-information latency-average" > 100000).crit(lambda: "status-information latency-average" > 100000).exec('/telemetry-action.sh', 'disable') 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 22

Visualizing and reacting to events

Visualizing Data Using Grafana Monitoring system Basic statistics on the data Event-based alerting system Threshold of 10 ms 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 24

Reacting to Events Using NSO Applications Engineers REST, RESTConf, NETCONF, Java, Python, Erlang, CLI, Web UI Service Manager Service Model Device Manager Device Model Network Equipment Drivers (NEDs) NETCONF, REST, SNMP, CLI, etc VNFM Controller Apps EMS and NMS Physical Networks Virtual Networks Network Apps 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 25

Reacting to Events Using Ansible Agentless Widely adopted Infrastructure as code Simple to use and learn: YAML playbooks Community and vendor driven Modular open-source framework, easily modified Leverage many common programming languages 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 26

Reconfiguring IOS XR devices using Ansible Playbooks Preamble: local execution Map inventory to playbook Use built-in IOS XR module Configuration to send - hosts: all gather_facts: no connection: local vars: cli: host: "{{ ansible_ssh_host }}" username: "{{ ansible_ssh_user default(cisco) }}" password: "{{ ansible_ssh_pass default(cisco) }}" port: "{{ ansible_ssh_port default(22) }}" tasks: - name: Avoid interface iosxr_config: provider: "{{ cli }}" parents: - "router isis {{ isis_name }}" - "interface {{ interface_name }}" lines: ["address-family ipv4 unicast metric maximum ] 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 27

Assembling the open-source Telemetry platform Docker and the evolution of virtualization Monolithic Bare Metal Monolithic Virtual Machine Application Containers with Docker App App App App Lib Lib App App App App App App App App Lib OS Hardware Guest OS Hypervisor Host OS Hardware Guest OS Lib Lib Lib Docker Engine Host OS Hardware Lib 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 28

Assembling the open-source Telemetry platform Overview on Docker Pull Image A complete app environment; base OS, application, libraries, dependency Run Push Commit Registry A hub of Docker images Build Container An instance of a Docker image Pull Automated build with Dockerfile 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 29

Assembling the open-source Telemetry platform Composing and orchestration with Docker Compose Pipeline Grafana 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 30

Assembling the open-source Telemetry analytics platform Composing and orchestration with Docker Compose grafana: image: appcelerator/grafana:grafana-4.6.2 volumes: ["./grafana:/etc/extra-config/grafana"] ports: ["3000:3000"] kapacitor: build: kapacitor depends_on: [influxdb, pipeline, grafana] environment: -KAPACITOR_INFLUXDB=http://influxdb:8086 - INVENTORY=xrv9000 ansible_ssh_host=... - INTERFACE=${XR_INTERFACE} - ISIS=${XR_ISIS} Full demo source: https://github.com/sbyx/telemetry-breakout influxdb: image: influxdb:1.4.2-alpine pipeline: build: pipeline depends_on: [influxdb] environment: - MDT_DIALIN_PORT=57777 - MDT_DIALIN_SUBSCRIPTIONS=... - INFLUX_URL=http://influxdb:8086 - INFLUX_USERNAME=admin - INFLUX_PASSWORD=admin - INFLUX_DB=mdt_realtime - INFLUX_DB_CREATE=1 - MDT_DIALIN_HOST_0=xrv9000... 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 31

Q & A 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public

Demo: Telemetry-based vbng address pool management

Demo: Telemetry-based vbng address pool management renew dhcp GigabitEthernet 2.800 IOSXRV9k-1 BNG server GigE 0/0/0/2.800 GigE 2.800 DHCP client DHCP bindings monitoring 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 34

Machine Learning & Big Data

Machine Learning Collect Observe Alert Take action How can you know in advance what your network state will be? Learn Predict events 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 36

Machine Learning In a nutshell Inspired from the human learning process Learn from experience E some task T with performance P Offspring of probabilities, statistics and computer science All about data: an optimized approach to deduce correlation of data points Supervised Learning Labeled Training data Classification Regression Reinforcement Learning Decision process with Reward system Unsupervised Learning Unlabeled Training data Clustering Regression 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 37

Machine Learning Applications Optimization Value prediction Gaming Market stocks Computer Vision Object recognition Recommender Systems Search engines Network security Signal processing Healthcare Artificial Intelligence Brain-computer interfaces Automated speech recognition Bioinformatics sequences Source: Google Trends 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 38

Machine Learning Open questions for a network maintainer When can I upgrade a routing device software How does the load of my network evolve and where What is the timeline for capacity upgrade When do I need traffic routing re-optimization What is the impact of maintaining an optical fibre Can I guarantee bandwidth 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 39

Cisco WAN Automation Engine (WAE) WAE provides tools for Network Optimization and Automation 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 40

Proposal in the context of Network Monitoring Forecasting tool predicting future network traffic, with key characteristics such as trend, seasonality and periodicity as well as localization, such as access, edge and core. Complements Cisco WAE solution. Delivers operational efficiency Allows network optimization Provides services differentiation 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 41

Methodology Main components of the development Explore statistical characteristics Identify roadblocks Iterative process Deep learning Validate the method Data access & understanding Modelling & Training Modelling & Training Traffic forecasting Data Data preprocessing understanding & preprocessing Testing & Forecasting Testing & Forecasting Solve missing data Apply transforms Extract features Assess performance on unseen data Check for robustness and lack of bias 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 42

01/04 03/04 05/04 07/04 09/04 11/04 13/04 15/04 17/04 19/04 21/04 23/04 25/04 27/04 29/04 01/05 03/05 05/05 07/05 09/05 11/05 13/05 15/05 17/05 19/05 21/05 23/05 25/05 27/05 29/05 31/05 02/06 04/06 06/06 08/06 10/06 12/06 14/06 16/06 18/06 Mbps Performance and Forecasting 80,000 70,000 LSTM Forecasting Original Traffic [Mbps] Forecasted Traffic [Mbps] 60,000 50,000 40,000 30,000 20,000 10,000 0 Date 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 43

14/06 15/06 16/06 17/06 18/06 Mbps Performance and Forecasting 70,000 60,000 50,000 40,000 LSTM Forecasting K-fold Training Score 3-month dataset MSE 0.00166 7461821.55 Mbps 2 RMSE 0.04072 2731.63 Mbps 30,000 20,000 10,000 0 Original Traffic [Mbps] Forecasted Traffic [Mbps] Test Score 3-month dataset MSE 0.00135 6805089.28 Mbps 2 RMSE 0.03681 2608.65 Mbps Date 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 44

Outlook Enabled pattern identification of network traffic Reliable anticipation of events in the network A complementary tool for intelligent network adaptation Observe, learn and understand the network state Adapt network based on an intelligent decision-making process 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 45

Big Data Motivation and role We can learn, but how do we improve our speed and scale? Trends Increased network capacity and bandwidth Cloud services Changing traffic patterns Continuous flow of data (application content and services) Areas of Improvement Complexity of the network Policy inconsistency Inability to scale Lack of high-performance data processing Distributed training per network node or interface Grid-based search for a model that optimally captures the features of the data 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 46

Big Data In a nutshell Not only a vast amount of data Various data sources Collection with high velocity Large amounts of data High granularity of the data Application of high-performance algorithms Rather, a set of technologies that support large-scale collection and provide a platform for enabling meaningful insight to data. 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 47

Big Data Solution: Platform for Network Data Analytics (PNDA) PNDA as a data platform Open source Data aggregation with high throughput with Apache Kafka Environment for OSS data storage and exploration Parallel processing for rapid operations Supports batch processing applications Predictive analysis on time-series data http://www.pnda.io/ 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 48

Q & A 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public

Agenda Introduction Demo: Link-quality assurance with Model-Driven Telemetry Streaming Telemetry data from a device to a collector Storing and processing Telemetry data in a time series platform Visualizing data and reacting to events Machine Learning & Big Data use cases Demo: Telemetry-based vbng address pool management Conclusion

Cisco Spark How Questions? Use Cisco Spark to communicate with the speaker after the session 1. Find this session in the Cisco Live Mobile App 2. Click Join the Discussion 3. Install Spark or go directly to the space 4. Enter messages/questions in the space cs.co/ciscolivebot# 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public

Please complete your Online Session Evaluations after each session Complete 4 Session Evaluations & the Overall Conference Evaluation (available from Thursday) to receive your Cisco Live T-shirt All surveys can be completed via the Cisco Live Mobile App or the Communication Stations Complete Your Online Session Evaluation Don t forget: Cisco Live sessions will be available for viewing on-demand after the event at www.ciscolive.com/global/on-demand-library/. 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public

Continue Your Education Recreate the first demo in your lab! Get the code at https://github.com/sbyx/telemetry-breakout Try it out by yourself in our instructor-led lab tomorrow from 09.00-13.00! Learn more about Telemetry, Containers and other IOS XR 6.x features and visit @xrdocs on Github! https://xrdocs.github.io Demos in the Cisco campus Walk-in Self-Paced Labs Lunch & Learn Meet the Engineer 1:1 meetings 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 53

Main Message Telemetry takes you from monitoring the network to understanding it. 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public 54