Machine Learning in WAN Research


Mariam Kiran (mkiran@es.net)
Energy Sciences Network (ESnet), Lawrence Berkeley National Lab
Oct 2017. Presented at Internet2 TechEx 2017

Outline
- ML in general
- ML in network research
- Literature review of research (2010 to Sept 2017) on ML algorithms in WANs
- Common areas, data involved, and problems solved
- Road ahead (unexplored areas)

AI, ML, DL: What's the Difference? (figure courtesy of the Nvidia blog)
- Turing, "Can machines think?": the Turing test asks whether a machine can exhibit human-like intelligence
- Machine learning is a collection of algorithms that can help achieve AI, e.g. spam filters, HR hiring, etc.
- Deep learning is one of these ML techniques
- Recent advances are due to GPU and HPC processing (previously very slow, too much data, needs training to work)
- Mainly used for image and speech recognition in commercial apps

AI Tree (example techniques); only a subset are ML algorithms
- Optimization techniques: evolutionary algorithms (genetic algorithms, evolutionary strategies, etc.), swarm intelligence (ant colony, etc.)
- Expert systems
- Fuzzy systems
- ML (wherever there is training or learning on statistical data): neural networks (convolutional networks, deep belief networks, deep Boltzmann networks, stacked autoencoders), Random Forest, clustering, etc.
- Networks: graph algorithms (routing, shortest path)
- Many more

Algorithms are chosen depending on:
- data available
- problem being solved
- combining multiple techniques (some reach 50% accuracy, others 80%)
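One simple way to combine multiple techniques is majority voting. The sketch below is only an illustration, not a method taken from the talk: the three flow classifiers, their byte thresholds, and the labels "bulk" and "interactive" are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label that most models predicted."""
    return Counter(predictions).most_common(1)[0][0]


def combine(models, sample):
    """Ask every model for a label and return the majority label."""
    return majority_vote([model(sample) for model in models])


# Hypothetical classifiers: each labels a flow from its size in bytes,
# using a different (made-up) threshold.
def make_size_classifier(threshold_bytes):
    def classify(flow_bytes):
        return "bulk" if flow_bytes > threshold_bytes else "interactive"
    return classify


models = [
    make_size_classifier(500_000),
    make_size_classifier(1_000_000),
    make_size_classifier(5_000_000),
]

label_large = combine(models, 2_000_000)  # two of three models say "bulk"
label_small = combine(models, 100)        # all three say "interactive"
print(label_large, label_small)
```

An odd number of models avoids ties for a two-class problem.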

Example: Choosing Algorithms for Problems (e.g. deep learning or DNNs)
Deep neural network | Input data | Applied for | Variants
Feed forward neural network | Hierarchical data representations | General classification, clustering, anomaly finding, feature extraction | Deep belief networks (built from stacked restricted Boltzmann machines), convolutional neural networks
Recurrent neural network | Sequential data representation (i.e. time series data) | Sequential learning (when a time relationship exists) | Long short-term memory (LSTM), used for speech translation
- There are many variants of DNNs, with papers and researchers dedicated to each specific DNN.
- DeepMind used deep reinforcement learning for Atari (deep Q-learning) and Go. State-action pairs are based on learned data.

Multiple Tools Available (DNN Libraries)
Toolkit | Language | Use | Processing capability
Caffe | C++ | Images and video | Distributed (HPC, GPU)
TensorFlow | Python | Images, regression, video, text, speech | Distributed (HPC, GPU)
Theano | Python | Images | Distributed (HPC, GPU)
Torch | Lua | Images and speech | Distributed (HPC, GPU)
- Google's DNN platform TensorFlow has been used to tag unlabeled videos, recognize images with 70% accuracy, and predict Gmail replies
- Scikit-learn is a Python library that is good for learning
- HPC innovation: analyze massive data sets
- Model and data parallelism reduce the training time
- DNNs are mostly used in image analysis

Bringing it back to Networks (Reviewing papers since 2010)

Recommended Machine Learning Use Cases (IETF forums)
- Network security: normal and outlier behaviors in traffic
- Change or predict possible behavior: "this <QoS value> will cause this <event Y> with probability <P>"
- Bug detection: software or hardware faults
- WAN path optimization: anticipate congestion, divert traffic to alternate paths
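The statement "this <QoS value> will cause this <event Y> with probability <P>" can be illustrated with a relative-frequency estimate. This is a minimal sketch under stated assumptions: packet loss as the QoS value, congestion as event Y, and the six history records are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    loss_pct: float   # hypothetical QoS value: packet loss in percent
    congested: bool   # hypothetical event Y: congestion observed afterwards


def event_probability(samples, threshold):
    """Estimate P(event | QoS value >= threshold) as a relative frequency."""
    matching = [s for s in samples if s.loss_pct >= threshold]
    if not matching:
        return None  # no observations at this QoS level
    return sum(s.congested for s in matching) / len(matching)


# Made-up measurement history, for illustration only.
history = [
    Sample(0.1, False), Sample(0.2, False), Sample(1.5, True),
    Sample(2.0, True), Sample(1.8, False), Sample(0.3, False),
]

p = event_probability(history, threshold=1.0)
print(f"P(congestion | loss >= 1.0%) = {p:.2f}")
```

A trained classifier or regression model would replace the frequency count once more features are involved; the count only shows the shape of the question.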

Conducted a Systematic Literature Review
- Step 1: Identify research questions
- Step 2: Identify a search string: Wide area networks AND (estimate OR predict) AND (learning OR data mining OR artificial intelligence OR pattern recognition OR regression OR classification OR optimization)
- Step 3: Identify relevant libraries, journals, papers: IEEE Xplore, ACM Digital Library, ScienceDirect, Web of Science, EI Compendex, and Google Scholar
Workflow: Step 1: Research questions -> Step 2: Search strategy -> Step 3: Study selection criteria -> Step 4: Quality assessment -> Relevant papers
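The search string from Step 2 is a conjunction of OR-groups, which can be sketched as a small text filter. This is an assumption about how such a string might be applied to titles or abstracts; real digital libraries use their own query engines, and the case-insensitive substring matching here is a simplification.

```python
# Term groups taken from the search string: groups are ANDed together,
# terms inside a group are ORed.
SEARCH_GROUPS = [
    ["wide area network"],
    ["estimate", "predict"],
    ["learning", "data mining", "artificial intelligence",
     "pattern recognition", "regression", "classification", "optimization"],
]


def matches_search_string(text, groups=SEARCH_GROUPS):
    """Return True if the text contains at least one term from every group.

    Matching is a case-insensitive substring test (an assumption), so
    "predict" also matches "predicting" and "prediction".
    """
    lowered = text.lower()
    return all(any(term in lowered for term in group) for group in groups)


hit = matches_search_string(
    "Predicting congestion in wide area networks using regression")
miss = matches_search_string("Deep learning for image recognition")
print(hit, miss)
```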

But Too Many Papers Found
- The space was too large: WANs are complete systems
- They have multiple layers (e.g. see picture)
- There are multiple WAN problems
Solution: let's organize the results:
- Create categories of similar problems
- Explore ML and non-ML solutions
- Identify which data sets were used

Grouping Problems into 4 Categories
[Figure: the four categories, split between user traffic data and infrastructure traffic data]
Figure labels: User traffic (directed flows); WAN topology (traffic engineering); (flow-level, traffic prediction, adaptation, path optimization, link failure); (packet-level, queues, TCP, UDP); infrastructure-level modifications (switches, deployment, etc.)

Machine Learning Approaches in WAN Networks
1) User traffic optimization: traffic prediction, path optimization
2) Topology engineering
3) Packet-level optimizations: traffic adaptation, TCP-specific problems, fault finding, scheduling, congestion
4) Infrastructure optimization (actual actions on the WAN): controller placements, switch configurations, multiple data center connectivity
Note: SDN related in (2, 3, 4)

Results

Relevant Papers: Statistics
Papers per library: IEEE Xplore #25; ACM pub #532; Web of Science #3; ScienceDirect #10
Other counts shown in the figure: #223, #188
Filtering steps: remove duplications; apply selection criteria; search additional relevance through references; remove surveys; apply quality assessment
Note: Google Scholar gave many irrelevant results and is not regarded as a good publication search tool.

Results per year (1)
[Bar chart: number of papers per year, 2010 to 2017, ML vs. non-ML]
Rise of ML techniques in 2017 (workshops at SIGCOMM, HotNets, etc.)

Results per category (2)
[Bar chart: number of papers per category (user traffic, traffic engineering, packet-level improvements), ML vs. non-ML]
- Non-ML is still largely favored
- Most ML techniques are used for classification (of traffic) and prediction (of failures)
- Techniques coupled with OpenFlow: perform classification and configure packets
- Some tools are enhanced by embedding ML for decision making:
  - Traffic awareness and security problems
  - Forming topologies, optimum path finding
  - Improving path utilization depending on arriving traffic
  - Optimizing infrastructure

Techniques used
Categories: Cat 1: User traffic analysis; Cat 2: Traffic engineering; Cat 3: Packet optimization; Cat 4: Optimize infrastructure
ML (in slide order): classification and regression (Naïve Bayes, decision trees, SVM, Random Forest, ANN); regression and classification techniques; SVR, decision trees, Naïve Bayes; regression and classification techniques
Non-ML (Cat 1 to Cat 4): rule-based learning, statistical analysis techniques; graph optimization (min cost), greedy search, SPF; fairness computations, path finding, game theory, Markov models, simulations; simulation, greedy algorithms for resource allocation
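As a point of reference for the non-ML side, shortest path first (SPF) is typically computed with Dijkstra's algorithm. The sketch below is a generic textbook version, not code from any of the reviewed papers; the four-node topology and its link costs are hypothetical.

```python
import heapq


def shortest_path_costs(graph, source):
    """Dijkstra's algorithm: cheapest cost from source to each reachable node.

    graph maps a node to a dict of {neighbor: non-negative link cost}.
    """
    costs = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > costs.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < costs.get(neighbor, float("inf")):
                costs[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return costs


# Hypothetical WAN topology with made-up link costs.
wan = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 3},
    "C": {"B": 2, "D": 7},
    "D": {},
}

costs = shortest_path_costs(wan, "A")
print(costs)
```

In this toy topology the route A to C to B to D (cost 6) beats both A to B to D (cost 7) and A to C to D (cost 8).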

Use cases per category
- Cat 1: User traffic analysis: intrusion detection, traffic profiling
- Cat 2: Traffic engineering: classify flows to form optimum topologies
- Cat 3: Packet optimization: path performance
- Cat 4: Optimize infrastructure: optimum connections between data centers
[Table: ML tasks per category, X marks as they appear on the slide]
Classification: X X X X
Regression: X X X
Clustering:
Dimension reduction:
Anomaly detection:
Feature learning:
Coupling with devices: X X
Demo using simulations: X X

Data Involved
Data ranges from packet data, path properties, IP addresses, QoS, TCP/UDP traces, etc.
Use cases | Focus | Data set used
Category 3: Packet-level optimization | VM resources | Fairness schemes, MTTF, MTTR, NetFlow
Category 4: Infrastructure optimization | Flow tables, controller placements | No. of jobs running, VM data, CPU usage, application data
E.g. Google's B4 optimizes topology for SD-WAN (based on demand, packet loss, utilization)

Road Ahead

Lots of Areas Still Under-developed
- Networks are mostly graph optimization problems: applying ML techniques is unique
- Reinforcement learning: an agent in state s follows policy π_θ(s, a) and takes action a
- Identify what we want to achieve along the pipeline (parameter θ): understanding (classification) -> prediction -> action
- Link with devices (SDN, NFV, etc.), but what are the knobs we can alter?
- ML research focuses on game strategies. We don't have similar strategies in networks!
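The policy π_θ(s, a) can be made concrete with a softmax over per-state action preferences. This is a sketch under assumptions, not the talk's method: the states ("congested", "idle"), the two actions (path 0 or path 1), and the θ values are hypothetical stand-ins for the "knobs" the slide asks about.

```python
import math
import random


def softmax_policy(theta, state):
    """Return pi_theta(state, a) for every action a as a list of probabilities."""
    prefs = theta[state]
    top = max(prefs)  # subtract the maximum for numerical stability
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def take_action(theta, state, rng):
    """Sample an action index according to the policy."""
    probs = softmax_policy(theta, state)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical parameters: one preference per action (path 0, path 1)
# for each link state. Learning would adjust these values from rewards.
theta = {"congested": [0.0, 2.0], "idle": [1.0, 1.0]}

rng = random.Random(0)
probs = softmax_policy(theta, "congested")
action = take_action(theta, "congested", rng)
print(probs, action)
```

Here the agent prefers path 1 when the link is congested and is indifferent when it is idle; a policy-gradient method would update θ after observing the result of each action.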

Breaking Down the ML Black Box
- Rather than one, have multiple algorithms
- Working with heterogeneous data sets: feature learning
- Computational costs of data processing and model training
- Using HPC/GPU to allow models to learn
- Do not treat ML as a black box, but understand why

Conclusions
- AI shows some promise: Learn, Try, Fail, Learn, Try, Succeed!
- Mix of skills: Networks + ML + HPC + (complex workflows)
- Combining techniques (and algorithms) to advance research in unexplored areas: new areas in networks and perhaps even more
- Opening and sharing data sets and techniques for research (R&E network community)

Any questions/comments?
Looking to become a postdoc? Please contact MKiran@es.net
Thank you!
Funded under DOE Panorama Project (2017-2019), DOE ASCR (2017-2022)