Network Traffic Visibility and Anomaly October 27th, 2016 Dan Ellis

Similar documents
Data-Driven Network Opera1ons. France-IX 2016 Avi Freedman

RIPE75 - Network monitoring at scale. Louis Poinsignon

Detecting Hidden Spam Bots (and other tales from the NetFlow front lines) Jim Meehan Director, Product Marketing

Compare Security Analytics Solutions

Flow-Based Network Monitoring using nprobe and ntopng

Mastering Near-Real-Time Telemetry and Big Data: Invaluable Superpowers for Ordinary SREs

Data Sheet. Monitoring Automation for Web-Scale Networks MONITORING AUTOMATION FOR WEB-SCALE NETWORKS -

Imperva CounterBreach

New Data Architectures For Netflow Analytics NANOG 74. Fangjin Yang - Imply

NetFlow Optimizer. Overview. Version (Build ) May 2017

RIPE76 - Rebuilding a network data pipeline. Louis Poinsignon

SaaS Providers. ThousandEyes for. Summary

Scrutinizer Flow Analytics

Introduction to Netflow

plixer Scrutinizer Competitor Worksheet Visualization of Network Health Unauthorized application deployments Detect DNS communication tunnels

Affordable High-Speed Sensors Everywhere. ntop Meetup Flocon 2016, Daytona Beach Jan 13th 2016

Table 1 The Elastic Stack use cases Use case Industry or vertical market Operational log analytics: Gain real-time operational insight, reduce Mean Ti

All Events. One Platform.

Network Management Automated Intelligence

ThousandEyes for. Application Delivery White Paper

Network Management and Monitoring

SolarWinds Engineer s Toolset Fast Fixes to Network Issues

UX - User Experience: Multi-Cloud Network Visibility

Driving Network Visibility

Trisul Network Analytics - Traffic Analyzer

Enhancing DDoS protection TAYLOR HARRIS SECURITY ENGINEER

PNDA.io: when BGP meets Big-Data

NetFlow Traffic Analyzer

Always Keep IT Purely Simple

The Keys to Monitoring Internal Web Applications

Subscriber Data Correlation

Sourcefire Network Security Analytics: Finding the Needle in the Haystack

LEARN READ ON TO MORE ABOUT:

Flow-based Traffic Visibility

Automated Response in Cyber Security SOC with Actionable Threat Intelligence

Ingest. David Pilato, Developer Evangelist Paris, 31 Janvier 2017

pmacct and Streaming Telemetry

THE RSA SUITE NETWITNESS REINVENT YOUR SIEM. Presented by: Walter Abeson

AMP-Based Flow Collection. Greg Virgin - RedJack

Network Operations Analytics

Ingest. Aaron Mildenstein, Consulting Architect Tokyo Dec 14, 2017

The Why, What, and How of Cisco Tetration

Network Security Monitoring with Flow Data

Cato Cloud. Software-defined and cloud-based secure enterprise network. Solution Brief

NetFlow Traffic Analyzer

CYBER ANALYTICS. Architecture Overview. Technical Brief. May 2016 novetta.com 2016, Novetta

SEVONE DATA APPLIANCE FOR EUE

DDoS Protection in Backbone Networks Deployed at Trenka Informatik AG (

DDoS Protection in Backbone Networks

Unified Communications from West

SolarWinds Engineer s Toolset Fast Fixes to Network Issues

Self-driving Datacenter: Analytics

Beyond Blind Defense: Gaining Insights from Proactive App Sec

PSOACI Tetration Overview. Mike Herbert

SEVONE END USER EXPERIENCE

High-speed Traffic Analysis Using ntopng: The New Features

ARIA SDS. Application

OUTLINE. NSLS-II control system environment Monitoring goals Splunk and Splunk Apps Unix, Nagios, Snort sflow and Cacti Putting it all together

Security, Internet Access, and Communication Ports

RSA Security Analytics

Introduc?on to pmacct

Visual TruView Unified Network and Application Performance Management Focused on the Experience of the End User

Monitor your containers with the Elastic Stack. Monica Sarbu

End-to-End Security Analytics with the Elastic Stack. Samir Bennacer

Cisco Tetration Analytics

Introduction to sflow

Growing DDoS attacks what have we learned (29. June 2015)

Cisco Tetration Analytics

SOLUTION BRIEF RSA NETWITNESS EVOLVED SIEM

Security, Internet Access, and Communication Ports

Interface Utilization vs. Flow Analysis

Sharing is Caring: Improving Detection with Sigma

IBM Aurora Flow-Based Network Profiling System

Security, Internet Access, and Communication Ports

Cisco Stealthwatch Improves Threat Defense with Network Visibility and Security Analytics

Additional Security Services on AWS

Monitoring and Threat Detection

Exam Name: Riverbed Certified Solutions Professional - Network Performance Management

System Requirements. Things to Consider Before You Install Foglight NMS. Host Server Hardware and Software System Requirements

Transforming the Cisco WAN with Network Intelligence

Tetration Hands-on Lab from Deployment to Operations Support

Graphing and statistics with Cacti. AfNOG 11, Kigali/Rwanda

Popular SIEM vs aisiem

Analytics Driven, Simple, Accurate and Actionable Cyber Security Solution CYBER ANALYTICS

Scalability Engine Guidelines for SolarWinds Orion Products

Minimizing the Risks of OpenStack Adoption

THE ACCENTURE CYBER DEFENSE SOLUTION

Features and Functionality

Preventing Data Breaches without Constraining Business Beograd 2016

How can we gain the insights and control we need to optimize the performance of applications running on our network?

AKAMAI CLOUD SECURITY SOLUTIONS

Pluribus UNUM Platform

Barry D. Lamkin Executive IT Specialist Capitalware's MQ Technical Conference v

CISCO NETWORKS BORDERLESS Cisco Systems, Inc. All rights reserved. 1

STATE OF THE NETWORK STUDY

Integrated Web Application Firewall (WAF) & Distributed Denial Of Service (DDoS) Mitigation For Today s Enterprises

Professional PostgreSQL monitoring made easy. Kaarel Moppel Kaarel Moppel

Splunk Review. 1. Introduction

Using Threat Analytics to Protect Privileged Access and Prevent Breaches

BGP Scaling (RR & Peer Group)

Transcription:

Network Traffic Visibility and Anomaly Detection @Scale: October 27th, 2016 Dan Ellis

Introduction Network traffic visibility?

Introduction Network traffic visibility? What data is available on your network What can you do with this data Tools available

Introduction Network traffic visibility? What data is available on your network What can you do with this data Tools available 20+ years running blind (ISP s, CDN s, enterprise) Who is Kentik Goal of this talk: Make your life easier

Traffic Visibility Problem Data networks can be compared to FedEx Imagine FedEx without package tracking Majority of data networks operate in this vacuum of visibility Hard to believe? Problem is massive data scale, lack of tools, little network + systems collaboration

Not Helping 6

A Poll Interface Volume (Mb/s, pps)?

A Poll Interface Volume (Mb/s, pps)? Src/Dst IP+Port, ASN, BGP Path?

A Poll Interface Volume (Mb/s, pps)? Src/Dst IP+Port, ASN, BGP Path? IP, Port, ASN or Path Thresholds?

Is this really a problem?? Maybe there isn t a traffic visibility problem Maybe no one really needs this data

Complaints of high latency BGP Path to Comcast 11

Dyn attack last week ISP recursive inbound 12

Dyn attack last week Traffic / source_ip

Dyn attack last week ISP recursive outbound

Use cases of traffic visibility Network Planning Peering Analytics and Abuse Congestion detection Is it the network? Where on the network? Proactive alerting Distributed DDoS Detection What Changed Post Deploy? Security and Breach Detection Cost Analytics Revenue Identification (New + Risk) Enabling Internal Groups

Tenets Infinite granularity storage for months Drillable visibility, network specific UI Real-time and fast (< 10s queries) Anomaly detection + actions Open / API Scale

Now we know what we need, how do we do it?

Where to find this data? TCP stats data / app specific data App Server PCAP agent Router + Flow data NetFlow, SFlow, IPFIX + SNMP interfaces info NETWORK + Router BGP Path info + Sys/Event logs TACACS or Syslog = Something actually useful

What kind of tools Current Open Source: Older Open Source: Commercial software: pmacct, ntop, SiLK, cacti cflowd, AS-PATH, RRDtool Arbor, Plixer, SevOne, Solarwinds, ManageEngine DIY Big Data: On-Prem Big Data: SaaS Big Data: Kafka + ELK, Hadoop, druid, grafana, tsdb Cisco Tetration, Deepfield Kentik, Datadog, Appneta, Splunk

Many tools gets you almost there 20

Really though (tools) Open source (ish): Pmacct Nprobe / Ntop Elastic Search + Kibana (ELK) Commercial: Arbor Kentik

Three layer approach data sources Ingest & Fusion layer Storage Layer (flow specific) Query Layer clients Query engine and UI Query interfaces SQL SELECT flow FROM router WHERE WWW REST >_ Each layer has separate and different scaling characteristics

Seriously? How much data Small network (< 10Gb/s traf.) Large network (1 Tb/s traf.) Querying over 30+ days 10k flows/sec (+rows/sec) 500k flows/sec @ 200k fps (518 B rows, 207 TB) in < 10s

Data fusion is a key enabler to useful data

DATA FUSION DATA FUSION ROUTER Decoder Modules NetFlow v5 DATA FUSION Mem Tables BGP RIB BGP Daemon PCAP agent Proxy NetFlow v9 IPFIX SFlow PCAP Single flow fused row sent to storage Geo ß à IP ASN ß à IP Custom Tags SNMP Poller Enrichment DB FLOW FRIENDLY DATASTORE

DATA FUSION CLIENT PROCESS ROUTER Proxy Decoder Modules NetFlow v5 NetFlow v9 DATA FUSION Mem Tables BGP RIB Geo ß à IP BGP Daemon IPFIX SFlow ASN ß à IP Custom Tags Enrichment DB Agent Unified Flow Single flow fused row sent to storage SNMP Poller Serious. Flux Capacitor. Is. Serious. FLOW FRIENDLY DATASTORE

Fusing should be: near real-time performed at ingest data specific

Network planning: traffic by BGP hop

Network planning: collapsed path, exclude 1st

Looking at existing architectures out there

PMAcct-based implementation data sources Ingest & Fusion layer Storage Layer Query Layer clients PCAP Flow frontend BGP

Nprobe + Ntop + ElasticSearch data sources Ingest & Fusion layer Storage Layer Query Layer clients nprobe Flow nprobe(s) BGP Flow BGP

Kafka + Elastic Search Dropbox implementation of a (mostly) open-source NetFlow solution here: Dropbox blog Requires custom ingest, fusing, UI

Caveats Ingest: Data-store: Query frontends very generic: Distributing and scaling (1xNProbe = 1xDevice) No SNMP (= no IF info available for fusion) Aggregation (no infinite granularity) Challenging at scale when ES very hard for MySQL/MongoDB Tailoring of meaningful dashboards difficult No anomaly detection Commercial HW solutions (Arbor) Appliance based not truly distributed pre-determined list of aggregated data (no infinite granularity)

And so

Why isn t everybody already doing it? Required areas of expertise (because every presentation needs a Vin diagram) Train all the other teams on the involved network protocols and their usage Network Engineers Distributed systems engineers Resilience / Reliability Geo-distributed ingest Flow friendly data-store Unicorn Low-level Network developers Make all of the above work reliably SREs BGP Daemon Flow inspection & conversion Network protocols hacking

Looking beyond the basics

Building on top of the platform Once you have a platform, what s next? Augmented flow (retransmits, latency, URL, DNS) Anomaly detection Multi-hop exit determination BGP-path congestion detection

Building on top of the platform Imagine if we could get performance data from the network: Q Depth Retransmits per flow TCP latency Application Latency You can: Nprobe (ntop) collects Latency, Rxmits, URL, DNS -> IPFIX flow Deploy on a host or a sensor Cisco, Juniper, Arista working to expose Q Depth into flow

Retransmits enhanced flow: rexmits / interface

Retransmits enhanced flow: rexmits / ASN

Retransmits enhanced flow: TCP latency / ASN

Building on top of the platform You shouldn t have to stare at dashboards or watch logs to detect badness Monitor top-x of any dimension combination (IP, ASN s, Geo, Interface) Create baselines based on time of day Be able to look at things beyond pps/bps such as retransmits, latency, logs Detect shifts: did an ASN or IP on a particular interface suddenly move from top-x #200 to #2 and that is unusual for this time of day This is available today (Open Source: Hadoop, Spark, Storm, Samza, Flink)

Use case: anomaly detection Traffic from one ASN (network) unusually high. Operator notified at red line.

Use case: traffic anomaly detection & annotation

Use case: traffic annotated w/ multiple events

Anomaly detection: DDoS detection & characteristics

Building on top of the platform Once you have a platform, what s next? üaugmented flow (retransmits, latency, URL, DNS) üanomaly detection q Multi-hop exit determination Challenging to map traffic from ingest to exit point, multi-hop q BGP-path congestion detection Detect individual congested paths within a circuit that isn t congested

Summary Networks can produce large amounts of data that will make your life easier Big Data platforms are able to consume this data Specific tools for Network Operators are beginning to appear (free & paid) Paid tools are more specific to network use (UI, easy setup, etc) Free tools have the power but require cobbling together pieces Much work to be done re fusing data such as logs, changes, alerts, DNS SaaS providers will provide community views and enable data-sharing

QUESTIONS? Dan Ellis dan@kentik.com CREDITS