Deep-Q: Traffic-driven QoS Inference using Deep Generative Network


Deep-Q: Traffic-driven QoS Inference using Deep Generative Network
Shihan Xiao, Dongdong He, Zhibo Gong
Network Technology Lab, Huawei Technologies Co., Ltd., Beijing, China

Background: What is a QoS model?
A QoS model takes the traffic on a network as input and outputs the resulting QoS values: delay, jitter, and packet loss.

Background: Why is it important?
Online QoS monitoring for SLA guarantees and anomaly detection (e.g., delay monitoring on Paths A, B, and C) requires costly real-time active QoS measurements. A QoS model helps eliminate most of that cost.

Background: Why is it important?
Offline traffic analysis: given a collected traffic trace plus the network, a QoS model can infer the delay of each path (A, B, C) without any QoS measurements.

Background: Why is it important?
What-if analysis: predict how the QoS will change if a flow switches from Path A to Path C.

Traditional Methods 1: Network simulators
Feed the traffic and the network into a simulator (NS2, NS3, OMNeT++) to obtain delay, jitter, and packet loss. Drawback: slow and inaccurate.

Traditional Methods 2: Mathematical modeling
Apply queuing theory under simplified assumptions to map traffic to delay, jitter, and packet loss. Drawback: large human-analysis cost, and still inaccurate.

A fast, accurate, and low-cost QoS model is therefore needed!

Key Observations
Observation 1: Traffic load per link is much easier to collect, and far better supported by existing tools (e.g., SNMP), than QoS values per path.

Observation 2: Traffic load is the key factor driving QoS changes. The QoS model takes the collected link-load matrices (indexed by node pairs) as traffic input and outputs delay, jitter, and packet loss.

Observation 3: Different traffic loads lead to different QoS distributions. Testbed measurement: 40 traffic loads (one per 20 minutes) produce visibly different distributions of measured delay samples.

Target problem: Given a set of traffic load matrices collected during a time window T, what are the distributions of the QoS values (delay, jitter, loss, ...) of each network path during T? (Different traffic loads lead to different QoS distributions.)

Solution of Deep-Q: Why does deep learning help?
- Low human-analysis cost: a data-driven model trained automatically, versus a human-engineered delay or loss model.
- Fast inference: a network simulator needs hours to turn packets into QoS values; a trained model maps a traffic load matrix to delay/jitter/loss values in milliseconds.

Key Technology: Deep Generative Network
The state-of-the-art DGNs in deep learning are the GAN (Generative Adversarial Network) and the VAE (Variational Autoencoder). In the image domain, a conditional GAN can infer an image from a text label such as "this small bird has a pink breast and crown, and black primaries and secondaries" (source: ICML 2016, Generative Adversarial Text to Image Synthesis), and a conditional VAE can infer images of a given digit such as the number 2 (source: NIPS 2014, Semi-supervised Learning with Deep Generative Models). In the network domain, Deep-Q infers a delay distribution (in us) from traffic load matrices. So what is the difference?

Key Technology: Deep Generative Network (Differences)
- Image domain (GAN & VAE). Application: text label to images. Input: a discrete label (discrete, low/high dimensional). Output: image samples (discrete, high dimensional). Target: the generated image samples satisfy the real image distribution and match the label class.
- Network domain (Deep-Q). Application: traffic load matrices to QoS values. Input: traffic statistics (continuous, high dimensional). Output: QoS values (continuous, low dimensional). Target: the generated QoS values satisfy the real QoS distribution and match the traffic statistics.
Deep-Q requires high accuracy on the output distribution, which GAN and VAE do not provide.

Deep-Q Solution 1: Handle the continuous high-dimensional input
Extract traffic features from a sequence of high-dimensional traffic load matrices. An LSTM (Long Short-Term Memory) module, a state-of-the-art deep learning method for learning features from data sequences, consumes the micro-load matrices M_t^1, M_t^2, ..., M_t^n collected during time t; the chained LSTM cells pass hidden states along the sequence and output the traffic features.
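The LSTM feature-extraction step above can be sketched as follows. This is a minimal NumPy forward pass with randomly initialized gates, not the paper's TensorFlow model; the class name, the dimensions, and the flattening of each load matrix are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LSTMFeatureExtractor:
    """Minimal LSTM that turns a sequence of micro-load matrices
    M_t^1..M_t^n into one traffic-feature vector (the final hidden
    state). Illustrative only: the weights are random, not trained."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = np.random.default_rng(seed)
        s = 1.0 / np.sqrt(hidden_dim)
        # One weight set per gate: input (i), forget (f), output (o), cell (g).
        self.W = {k: rng.uniform(-s, s, (hidden_dim, input_dim)) for k in "ifog"}
        self.U = {k: rng.uniform(-s, s, (hidden_dim, hidden_dim)) for k in "ifog"}
        self.b = {k: np.zeros(hidden_dim) for k in "ifog"}
        self.hidden_dim = hidden_dim

    def extract(self, load_matrices):
        h = np.zeros(self.hidden_dim)
        c = np.zeros(self.hidden_dim)
        for M in load_matrices:
            x = np.asarray(M, dtype=float).ravel()  # flatten one load matrix
            i = sigmoid(self.W["i"] @ x + self.U["i"] @ h + self.b["i"])
            f = sigmoid(self.W["f"] @ x + self.U["f"] @ h + self.b["f"])
            o = sigmoid(self.W["o"] @ x + self.U["o"] @ h + self.b["o"])
            g = np.tanh(self.W["g"] @ x + self.U["g"] @ h + self.b["g"])
            c = f * c + i * g       # cell state carries long-term memory
            h = o * np.tanh(c)      # hidden state passed to the next cell
        return h                    # traffic features for time t

# Example: a sequence of 5 micro-load matrices for a 4-node network.
extractor = LSTMFeatureExtractor(input_dim=16, hidden_dim=8)
features = extractor.extract(np.random.default_rng(1).random((5, 4, 4)))
```

In the real system the final hidden state would feed the VAE rather than being used directly.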

Deep-Q Solution 2: Handle the continuous low-dimensional output
Challenge: high accuracy is required for QoS distribution inference.
Solution: a new metric, the Cinfer loss, which accurately quantifies the QoS distribution error. Let X be the inferred QoS distribution and Y the target QoS distribution; the Cinfer loss measures the height difference between the CDF (cumulative distribution function) curves of X and Y along the delay axis.
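As a rough illustration of a CDF-height-difference metric, the sketch below compares two empirical CDFs on a shared grid. The slides do not give the exact formula of the Cinfer loss, so the aggregation used here (mean absolute height difference) and the function names are assumptions.

```python
import numpy as np

def empirical_cdf(samples, grid):
    """Fraction of samples <= g, evaluated at each grid point g."""
    return np.mean(samples[:, None] <= grid[None, :], axis=0)

def cdf_height_error(inferred, target, n_points=200):
    """Mean absolute height difference between the empirical CDF
    curves of the inferred and target QoS samples, on a common grid
    spanning both sample ranges. A stand-in for the Cinfer loss,
    whose exact definition is not given on the slides."""
    lo = min(inferred.min(), target.min())
    hi = max(inferred.max(), target.max())
    grid = np.linspace(lo, hi, n_points)
    return np.mean(np.abs(empirical_cdf(inferred, grid)
                          - empirical_cdf(target, grid)))

# Example: inferred delays shifted 1 ms from the target delays.
target_delay = np.array([1.0, 2.0, 3.0])
err = cdf_height_error(target_delay + 1.0, target_delay)
```

A perfect inference (identical samples) gives an error of exactly zero; any distribution shift raises the CDF height gap.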

Deep-Q Solution: A stable and accurate inference engine
Deep-Q is built upon a VAE (stable) and augmented with the Cinfer loss (accurate). A simple example of learning ability when matching a target distribution: with its L2 loss, a VAE is stable but inaccurate; with its KL loss, a GAN is more accurate but unstable; with the Cinfer loss, Deep-Q is both stable and accurate.

Cinfer-loss computation for training
The exact computation is NP-hard, and the approximation must be fully differentiable so that gradients can be computed for training.
Step 1: Discretization. Replace the integral over the delay axis with a discrete sum over bins.

Step 2: Bin-height computation, which must be differentiable. An intuitive method is to compute the bin index of each sample and count the number of samples per bin, but the required ceiling function is non-differentiable and difficult to approximate.

Instead, a differentiable method with some math tricks borrowed from deep learning:
1) Express the per-bin count using the sign function.
2) Approximate the sign function with tanh.
The resulting approximation error was below 10^-5 in experiments.
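The sign-then-tanh trick in the two steps above can be demonstrated directly. The slope constant k is an assumed hyperparameter; when samples do not sit exactly on the bin edges, the soft heights match the hard counts closely, consistent with the small approximation error the slides report.

```python
import numpy as np

def hard_cdf_heights(samples, edges):
    """Non-differentiable reference: fraction of samples <= each edge."""
    return np.mean(samples[None, :] <= edges[:, None], axis=1)

def soft_cdf_heights(samples, edges, k=1000.0):
    """Differentiable CDF heights at each bin edge. The hard indicator
    1[x <= e] equals 0.5 * (sign(e - x) + 1); replacing sign(z) with
    tanh(k * z) (slope k is an assumed hyperparameter) makes the bin
    heights differentiable in the samples, so gradients can flow."""
    z = edges[:, None] - samples[None, :]            # edge minus sample
    soft_indicator = 0.5 * (np.tanh(k * z) + 1.0)    # ~1 when sample <= edge
    return soft_indicator.mean(axis=1)

# Example: 100 delay samples against 3 bin edges.
samples = np.linspace(0.005, 0.995, 100)
edges = np.array([0.25, 0.5, 0.75])
soft = soft_cdf_heights(samples, edges)
hard = hard_cdf_heights(samples, edges)
```

In an actual training loop the same expression would be written in the autodiff framework (e.g., TensorFlow) so the gradient with respect to the generated samples is computed automatically.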

Deep-Q Solution: Putting it all together
Training phase: traffic data is collected from the underlay network; the traffic load matrices along time are turned into space-time traffic features; the VAE encoder maps the network QoS (delay, jitter, loss) to a latent code Z, and the VAE decoder reconstructs the network QoS from Z together with the traffic features.
Inference phase: Z is sampled from N(0,1) and decoded, conditioned on the traffic features, into QoS values.
Automatic feature engineering and QoS modeling: end-to-end training using the Cinfer loss.
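The inference phase described above, stripped to its sampling skeleton, might look like the sketch below. The decoder here is a toy stand-in for the trained conditional VAE decoder, and all names and dimensions are illustrative assumptions.

```python
import numpy as np

def infer_qos_distribution(decoder, traffic_features, latent_dim,
                           n_samples=1000, seed=0):
    """Inference phase: draw z ~ N(0, 1) and run a conditional decoder
    on (z, traffic features) to produce QoS samples whose empirical
    distribution is the inferred QoS distribution. `decoder` stands in
    for the trained VAE decoder, which is not public."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_samples, latent_dim))
    feats = np.broadcast_to(traffic_features,
                            (n_samples, traffic_features.size))
    return np.array([decoder(zi, fi) for zi, fi in zip(z, feats)])

# Toy stand-in decoder: an affine map of [z, features] to one delay value.
def toy_decoder(z, feats):
    return 10.0 + 0.5 * z.sum() + feats.mean()

delays = infer_qos_distribution(toy_decoder, np.ones(8), latent_dim=4)
```

Each call yields a whole set of QoS samples, so path-level delay percentiles can be read off the empirical distribution rather than from a single point estimate.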

Experiment Setup
Testbed topologies: a data center network and an overlay IP network (NEU200 probes and CPEs; routers r0-r4 spanning AS-1, the Internet, and AS-2).
Traffic traces: WIDE backbone network [1]. Training set: 24 hours of traffic traces from April 12, 2017. Test set: 24 hours of traffic traces from April 13, 2017.
Neural network: TensorFlow implementation with 2 hidden layers.
[1] Traffic traces are publicly available at http://mawi.wide.ad.jp/mawi/

Experiment Results: Delay Inference in the Datacenter Topology
Given the traffic and the real delay, the distribution, mean, 90th-percentile, and 99th-percentile errors of inference are compared for queuing theory and the deep learning methods.
1. The deep learning methods achieve on average 3x higher accuracy than queuing theory.
2. Deep-Q achieves the lowest errors and the most stable performance across all cases.

Experiment Results: Packet Loss Inference in the Overlay IP Topology
1. The deep learning methods achieve on average 3x higher accuracy than queuing theory.
2. Deep-Q achieves the lowest errors and the most stable performance across all cases.
Deep-Q inference speed: under 10 ms for network scales below 200 nodes.

Conclusion
Deep-Q: an accurate, fast, and low-cost QoS inference engine.
- Automation: an LSTM module for automatic traffic feature extraction.
- High stability: an extended VAE inference structure with an encoder and a decoder.
- High accuracy: a new metric, the Cinfer loss, that accurately quantifies the QoS distribution error.
Future vision:
- Learn device-level QoS models (routers/switches) to compose scalable network-level QoS models.
- Learn high-level application QoE from traffic traces.
