Multi VIM/Cloud Evolvement for Carrier Grade Support


Multi VIM/Cloud Evolvement for Carrier Grade Support
VMware: Xinhui Li, Ramki Krishnan, Sumit Verdi
China Mobile: Lingli Deng, Chengli Wang, Yuan Liu, Yan Yang
AT&T: Bin Hu
Wind River: Gil Hellmann, Bin Yang
Huawei: Gaoweitao, Jianguo Zeng, Chris Donley

Retrospective on R1 use case and integration support:
- Keystone v2 vs. v3
- Polling-based alert emission
- Unknown VM status and a standalone event system
- Runtime only
- Concurrency

Multi Cloud for S3P:
- Improvement of the Keystone proxy
- FCAPS modeling
- Event/alert/meter federation
- Runtime + onboarding
- Framework scalability
- Platform awareness
- Workload orchestration

Alert/Event/Meter Federation Framework

Different cloud backends produce different alerts/events/meters; two roles are served:
- Multi-tenancy workloads
- Admin

Motivation:
- Alerts, events, and metrics are all needed from the VIM controller:
  - Closing the service-resilience loop in ONAP needs the underlying events (not only from resources but also from the VIM controller)
  - Closing the auto-scaling loop in ONAP needs the underlying meters/alerts, etc.
- Aligning/translating the FCAPS models from the different VIM controllers is needed
- Framework enhancements are needed:
  - Allow a meaningful closed loop for handling time-sensitive events/alerts
  - Allow a streaming mechanism for handling accuracy-sensitive metrics
  - Extend the VES agent implemented in R1, which only emits a VM-abnormal alert, to collect more periodic data
- Visualization on the Portal

FCAPS Modeling

Different categories of data: resource status, alerts, performance, and so on:
- Interface protocol
- Data format
- Metrics
- Logs

All the data should be collected under the same preconditions:
- Periodic or event-driven?
- NTP and facility requirements
- Delay sensitivity
- Different data sources

Specification and modeling efforts are needed to formulate these dimensions.
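
To illustrate the kind of normalization such modeling implies, the sketch below maps a backend-specific metric sample into a common record. The field names and the mapping table are assumptions for illustration, not the ONAP/VES schema:

```python
from dataclasses import dataclass, asdict
import time

@dataclass
class NormalizedSample:
    """Common record for metrics from heterogeneous VIM backends.

    Field names are illustrative; the actual ONAP/VES data model differs.
    """
    source_vim: str        # e.g. "openstack-regionone"
    resource_id: str       # VM/host identifier in the backend
    metric: str            # canonical metric name, e.g. "cpu.util"
    value: float
    unit: str
    timestamp: float       # epoch seconds, NTP-synchronized at the source

def normalize_openstack_sample(raw: dict) -> NormalizedSample:
    # Map a (hypothetical) Ceilometer-style sample to the common record.
    return NormalizedSample(
        source_vim=raw["region"],
        resource_id=raw["resource_id"],
        metric={"cpu_util": "cpu.util"}.get(raw["counter_name"],
                                            raw["counter_name"]),
        value=float(raw["counter_volume"]),
        unit=raw["counter_unit"],
        timestamp=raw.get("timestamp", time.time()),
    )

sample = normalize_openstack_sample({
    "region": "openstack-regionone",
    "resource_id": "vm-42",
    "counter_name": "cpu_util",
    "counter_volume": "73.5",
    "counter_unit": "%",
    "timestamp": 1510646400.0,
})
print(asdict(sample)["metric"])  # cpu.util
```

Whatever the concrete schema, the point is that every backend plugin translates into one record shape before data reaches the bus.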

Alert/Event/Meter Federation Framework

Different cloud backends produce different alerts/events/meters; two roles are served:
- Multi-tenancy workloads
- Admin

Use cases:
- Closing the service-resilience loop in ONAP needs the underlying events (not only from resources but also from the VIM controller)
- FCAPS proposes the requirements for monitoring and management of alerts/events/meters
- Visualization on the Portal

Measurement Class

The Measurement class represents metrics collected from various sources. It can carry CPU, memory, or filesystem metrics collected over a period of time.

Event Class

The Event class represents VIM-specific events, for example virtual-machine power-off events or port creation/deletion events.

Alarm Class

The Alarm class represents alarms triggered by the VIM. Basic alarm types include:
- Fault alarm
- Threshold-crossing alarm
- State-change alarm
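
A minimal sketch of the three classes described above; the attribute names are assumptions for illustration, not the ONAP data model:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict

@dataclass
class Measurement:
    # Metrics (CPU, memory, filesystem, ...) sampled over a period.
    resource_id: str
    period_start: float
    period_end: float
    metrics: Dict[str, float] = field(default_factory=dict)

@dataclass
class Event:
    # VIM-specific events, e.g. VM power-off, port creation/deletion.
    resource_id: str
    event_type: str          # e.g. "vm.poweroff", "port.create"
    timestamp: float

class AlarmType(Enum):
    FAULT = "fault"
    THRESHOLD_CROSSING = "threshold_crossing"
    STATE_CHANGE = "state_change"

@dataclass
class Alarm:
    # Alarms triggered by the VIM, derived from meters, events, or
    # rules predefined in the backend cloud.
    resource_id: str
    alarm_type: AlarmType
    severity: str
    description: str

alarm = Alarm("vm-42", AlarmType.THRESHOLD_CROSSING, "major",
              "cpu.util above 90% for 5 minutes")
print(alarm.alarm_type.value)  # threshold_crossing
```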

Alert/Event/Meter Federation Framework

[Diagram: a control plane for the Event/Alert/Streaming/Agent services sits in Multi-Cloud; OpenStack, VMware, Wind River, Kubernetes, Azure, and other controllers publish onto the ONAP message/data bus layer (DMaaP); DCAE, Policy, and Portal subscribe; the VES collector can be queried.]

Event Service:
- Federates events from different VIM providers onto the ONAP message/data bus services
- Can be configured by the control plane with listeners and endpoints
- Carries not only events of the different backend clouds but also events from the VIM controllers

Streaming Service:
- Federates meters from different VIM providers onto the ONAP message/data bus services
- Can be configured by the control plane with gate rate and watermark
- Aims at an ideal long-term output rate that is faster than, or at least equal to, the long-term data input rate

Alert Service:
- Alerts come from meters or events, or are predefined in the different backend clouds
- Can be customized
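
A federation plugin would push normalized events to the DMaaP message router over REST (publishing goes to /events/<topic>); the sketch below only builds the request so the shape is visible. The endpoint host, topic name, and payload fields are assumptions for illustration:

```python
import json
import urllib.request

DMAAP_URL = "http://dmaap.onap.example:3904"   # hypothetical endpoint

def build_publish_request(topic: str, event: dict) -> urllib.request.Request:
    """Build the POST that publishes one normalized event onto a DMaaP
    message-router topic."""
    return urllib.request.Request(
        f"{DMAAP_URL}/events/{topic}",
        data=json.dumps(event).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_publish_request(
    "unauthenticated.SEC_FAULT_OUTPUT",          # topic name is illustrative
    {"source": "openstack-regionone", "eventType": "vm.poweroff"},
)
print(req.full_url)
# A real plugin would then call urllib.request.urlopen(req) to publish.
```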

Operational Intelligence Federation Framework

[Diagram: the Multi-Cloud framework connects OpenStack services (REST) and the VMware vRealize Operations Manager (REST/SNMP, via the ONAP NFVI collector), through plugins and an extensible API/service publish-and-execution engine with scheduler, to the Event, Alert, and Streaming services on the ONAP message/data bus layer (DMaaP); Policy, Portal, OF, and DCAE subscribe.]

Extensible MC framework:
- Pluggable framework
- Intelligence contained
- Standardized information models
- Customizable to work with the upper layers
- Policy-driven data distribution

The telemetry sources compensate each other:
- OpenStack control-plane service status
- Underlying OS/hypervisor status
- Customizable by policy

Alert/Event/Meter Federation Flow Chart

[Sequence diagram between Multi Cloud, DMaaP, DCAE, Holmes, Policy, and the controllers: Multi Cloud listens to VIM events and publishes them to DMaaP; DCAE and Holmes subscribe; Policy drives VIM healing actions (e.g. recover, stop VM, start VM), which the controllers execute as VM actions back on the VIM.]
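
The policy-driven healing step in the flow above can be pictured as a small handler that maps an incoming alarm to an ordered list of VIM actions. The alarm names and the dispatch table are invented for illustration; a real deployment would resolve the actions from Policy rules:

```python
# Alarm type -> ordered healing actions; illustrative, not Policy syntax.
HEALING_ACTIONS = {
    "vm.unreachable": ["stop_vm", "start_vm"],   # restart cycle
    "vm.fault":       ["recover"],
}

def plan_healing(alarm_type: str) -> list:
    """Return the ordered VIM actions for an alarm, or [] if no rule matches."""
    return HEALING_ACTIONS.get(alarm_type, [])

print(plan_healing("vm.unreachable"))  # ['stop_vm', 'start_vm']
```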

Multi VIM/Cloud Framework Improvement

[Diagram: a bootstrap process forks an engine pool of handler engines (API Handler, Event Handler, Alert Handler) with an engine scheduler; the engines share the API-server socket at http://<ip address>:<port>, serve Portal, SDC, VNFSDK, and other controllers, write to A&AI and log files, and drive the backend plugins (OpenStack, VMware, Wind River, Kubernetes, Azure) including image management.]

Performance improvement:
- The bootstrap process uses fork to create multiple handler engines
- The number of handler engines is configurable; it defaults to the number of CPU cores
- Each handler engine uses a scheduler to run multiple jobs

Stability improvement:
- The bootstrap process watches the handler engines and respawns any that exits unexpectedly
- Handler engines run as separate processes so that memory and resources are isolated
- Logs are stored in files

Scalability improvement:
- Multiple handler engines share a single socket bound to the <IP address>:<port> of the API server
- The northbound API can be extended via a YAML file
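
The fork-and-share-a-socket design described above is the classic pre-fork server pattern. A minimal sketch, not the actual MultiCloud code (function names are invented; it assumes a Unix host, since os.fork is Unix-only):

```python
import os
import socket

def make_listener(host="127.0.0.1", port=0):
    """Bind the API socket once in the bootstrap process; forked handler
    engines inherit the fd and can accept() on it concurrently."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock

def spawn_engines(sock, n_engines=None):
    """Fork one handler engine per CPU core by default; the parent keeps
    the child PIDs so it can watch and respawn any that dies."""
    n_engines = n_engines or os.cpu_count()
    pids = []
    for _ in range(n_engines):
        pid = os.fork()
        if pid == 0:
            # Child: a real engine would loop on sock.accept() here and
            # dispatch jobs via its internal scheduler.
            os._exit(0)
        pids.append(pid)
    return pids

listener = make_listener()
pids = spawn_engines(listener, n_engines=2)
for pid in pids:
    os.waitpid(pid, 0)   # the bootstrap would instead watch and respawn
print(len(pids))  # 2
```

Running the engines as processes (rather than threads) is what gives the stated isolation: a crash or leak in one engine cannot corrupt the others, and the watcher simply forks a replacement.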

Support Onboarding

VNF onboarding and upgrade today:
- No formal way for a VNF to ask for resources and ensure they can be, and will be, granted at install time and reserved through operation
- Limited security, isolation, scalability, and self-healing; VIM and VNF specific
- Efficiency: in many cases, no other VNF runs on the same platform
- Operator, VNF-vendor, and VIM-specific procedures are tailored to each VIM infrastructure, operator, and VNF vendor

Multi Cloud will help:
- Automatically ensure the capacity and capability of the required VIM and infrastructure resources
- Verify/validate that the VIM image (qcow2 or vmdk) requirements can be met

Onboarding

[Diagram: the vendor uploads a valid package to the reference repository (marketplace); the package passes lifecycle tests and function tests, with optional further testing; the operator/SDC downloads or queries the package; the runtime component runs package validation and function tests via the Robot test framework against image persistence (different states) and runtime infrastructure (Multi-Cloud).]

Multi VIM/Cloud Extension for Onboarding

[Diagram: the image service in Multi-Cloud sits under a control plane for the image service and a control plane for resource assurance; it uploads, verifies, copies, and launches images across the backends (OpenStack Metadef, VMware VSAN, Wind River Ceph, Kubernetes registry, filesystem, Swift, Azure), serving Portal, SDC, and VNFSDK over the ONAP message/data bus layer (DMaaP) and recording into A&AI.]

The image service handles:
- Different kinds of storage for image files and the metadata reservoir
- Both Docker and VM images
- VNF onboarding and runtime optimization

Primary operations:
- Image upload/download
- Image registry/un-registry
- Copy from one VIM to another VIM
- Copy from one data store to another data store
- Launch instances

Transparent to the upper layer:
- Automatic handling of cross-VIM copy and capacity management
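
The cross-VIM copy operation listed above can be sketched as a driver-based service. The driver interface, method names, and the in-memory store are hypothetical, chosen only to show the pull-from-source/push-to-target shape:

```python
from abc import ABC, abstractmethod

class ImageStoreDriver(ABC):
    """Backend-specific image store (Glance, VSAN, Ceph, a Docker
    registry, ...); this interface is illustrative only."""
    @abstractmethod
    def download(self, image_id: str) -> bytes: ...
    @abstractmethod
    def upload(self, name: str, data: bytes) -> str: ...

class InMemoryStore(ImageStoreDriver):
    # Stand-in backend for the sketch.
    def __init__(self):
        self.images = {}
    def download(self, image_id):
        return self.images[image_id]
    def upload(self, name, data):
        self.images[name] = data
        return name

def copy_image(src: ImageStoreDriver, dst: ImageStoreDriver,
               image_id: str, new_name: str) -> str:
    # Cross-VIM copy: pull from the source store, push to the target.
    return dst.upload(new_name, src.download(image_id))

vim_a, vim_b = InMemoryStore(), InMemoryStore()
vim_a.upload("vfw.qcow2", b"...image bytes...")
copied = copy_image(vim_a, vim_b, "vfw.qcow2", "vfw.qcow2")
print(copied in vim_b.images)  # True
```

Keeping the per-backend logic behind one driver interface is what makes the copy transparent to the upper layer: the caller names two VIMs and an image, nothing more.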

NFVI Host NUMA Resource Exposure

NUMA awareness:
- Reduces memory-access latency
- Fits the VM workload's NUMA requirements to the available NFVI host NUMA resources

NUMA requirements from a VM workload:
- How many NUMA nodes
- vCPU list for each NUMA node
- Memory size for each NUMA node

NUMA resources of the NFVI:
- NFVI host list with NUMA-awareness capability
- Number of NUMA nodes
- pCPU list for each NUMA node: total and available
- Memory size/range for each NUMA node: total and available

Mismatching NUMA Requirements to NUMA Resources, Case 1

A VNF usually consumes a big chunk of memory, leading to unbalanced usage of NUMA resources across NFVI hosts and wasted memory. What is usually observed: the available memory exceeds the request, yet no NFVI host can place the VNF.

Case 1: a single NUMA node requests more memory than is available on any single NUMA node, but less than the total available memory.

Example:
- VM 1 workload: NUMA node 1: vCPU: 8, memory: 32G
- NFVI host 1: NUMA node 1: pCPU: 3.75, avail. memory: 18G; NUMA node 2: pCPU: 5.15, avail. memory: 30G

Mismatching NUMA Requirements to NUMA Resources, Case 2

Case 2: there is no single predefined NUMA requirement that fits dynamically changing NUMA resources.

Example:
- VM 1 workload: NUMA node 1: vCPU: 4, memory: 16G; NUMA node 2: vCPU: 4, memory: 16G
- VM 2 workload: NUMA node 1: vCPU: 4, memory: 4G; NUMA node 2: vCPU: 4, memory: 4G
- NFVI host 1: NUMA node 1: pCPU: 3.75, avail. memory: 18G; NUMA node 2: pCPU: 5.15, avail. memory: 30G

ONAP MultiCloud Tunes the NUMA Requirements

MultiCloud tunes the NUMA requirements to fit the VM workloads into the NFVI host NUMA resources. For this, the NFVI host NUMA resources must be exposed and kept updated in the ONAP data store.

Example of the MultiCloud EPA matching logic:
- VM 1 workload (NUMA node 1: vCPU: 4, memory: 16G; NUMA node 2: vCPU: 4, memory: 16G) is passed through unchanged
- VM 2 workload (NUMA node 1: vCPU: 4, memory: 4G; NUMA node 2: vCPU: 4, memory: 4G) is transformed into a single NUMA node: vCPU: 8, memory: 8G
- Target: NFVI host 1 (NUMA node 1: pCPU: 3.75, avail. memory: 18G; NUMA node 2: pCPU: 5.15, avail. memory: 30G)
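
The pass-through-or-transform decision can be sketched as a fit check with a fallback that collapses the request onto a single NUMA node. The data shapes and the collapse rule are assumptions for illustration (the sketch matches on memory only and ignores pCPU counts), not the actual EPA matching logic:

```python
def fits(requirement, host):
    """Each requested NUMA node must land on a distinct host NUMA node
    with enough available memory (greedy largest-first matching)."""
    free = sorted((node["avail_mem"] for node in host), reverse=True)
    wants = sorted((node["mem"] for node in requirement), reverse=True)
    if len(wants) > len(free):
        return False
    return all(w <= f for w, f in zip(wants, free))

def tune(requirement, host):
    """Pass the requirement through if it fits as-is; otherwise collapse
    it to one NUMA node (summing vCPU and memory) and retry."""
    if fits(requirement, host):
        return requirement
    merged = [{
        "vcpu": sum(n["vcpu"] for n in requirement),
        "mem": sum(n["mem"] for n in requirement),
    }]
    return merged if fits(merged, host) else None

host = [{"avail_mem": 18}, {"avail_mem": 30}]                # GiB available
vm1 = [{"vcpu": 4, "mem": 16}, {"vcpu": 4, "mem": 16}]
print(tune(vm1, host) == vm1)        # True: passed through unchanged

one_node_host = [{"avail_mem": 18}]
vm2 = [{"vcpu": 4, "mem": 4}, {"vcpu": 4, "mem": 4}]
print(tune(vm2, one_node_host))      # [{'vcpu': 8, 'mem': 8}]: transformed
```

Note that Case 1 from the earlier slide (one node asking for 32G against 18G + 30G) still returns None here: no tuning can split a single requested node, which is exactly why exposing per-node availability to ONAP matters.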

Enabling Cloud Native Workloads Orchestration via MultiCloud

[Diagram: the Service Orchestrator and Controllers (global orchestration and close-loop automation), via the Cloudify plugin for MultiCloud, drive the MultiCloud container plugin (e.g. K8S); the cloud-native infrastructure is a Kubernetes cluster on bare-metal hardware resources, with the Kubernetes master (API server, etcd, scheduler, controller manager) and worker nodes (kubelet, pods) hosting the Kubernetes workloads.]

Enabling Cloud Native Workloads Orchestration via MultiCloud

MultiCloud north-bound components (consumers): Service Orchestrator, ONAP Operations Manager (OOM), SDN-C, Generic VNF Controller, Optimization Framework, A&AI.

Atomic resource definition (TOSCA model):
- Defines new data and/or node types to represent individual cloud resources in a cloud-agnostic fashion
- Custom data/node types are included for EPA etc.

Cloudify plugin for MultiCloud:
- A generic, thin API proxy; the actual interaction with the MultiCloud APIs
- Defines the superset of all properties of each cloud-agnostic node type
- A single service template can support multiple clouds via model + inputs

Template-level orchestration, e.g.:
- Manages the templates in the catalog, handles the mapping of input parameters, issues the necessary sequence of calls to the MultiCloud layer to perform the required operations, and handles success and error responses

MultiCloud deploys and operates at the atomic-resource level with minimum orchestration, e.g.:
- Acts as the arbiter of which nodes and/or properties to use for the targeted cloud
- Uses per-node orchestration logic (e.g. via separate micro-services)
- VIM-specific paths look at the properties they know about and ignore the others
- Exposes a cloud-specific south-bound interface, e.g. Kubernetes plugins, etc.

Backup Slides

Enabling Cloud Native Workloads Orchestration via MultiCloud

Decoupling template-level operations from atomic-resource-level operations:

Atomic resource definition:
- TOSCA data and/or node types represent the individual cloud resources in a cloud-agnostic fashion; custom data and/or node types per cloud type are also included
- Defines the superset of all properties of each cloud-agnostic node type

Template-level orchestration:
- A single template can support multiple clouds because of model + inputs with runtime resolution
- Orchestration is primarily done within the TOSCA orchestrator north of the MultiCloud layer
- E.g.: manages the templates in the catalog, handles the mapping of input parameters, issues the necessary sequence of calls to the MultiCloud layer to perform the required operations, handles success and error responses, etc.

Minimum orchestration at the atomic-resource level in the MultiCloud layer:
- The MultiCloud layer is the arbiter of which nodes and/or properties to use for the particular target cloud
- Per-node orchestration logic can be used in the MultiCloud layer (e.g. via separate micro-services); the VIM-specific paths look at the properties they know about and ignore the others

Enabling Cloud Native Workloads Orchestration via MultiCloud

Service Orchestrator TOSCA example:

tosca_definitions_version: tosca_simple_yaml_1_0

topology_template:
  description: Template of an application connecting to a database.

  node_templates:
    web_app:
      type: tosca.nodes.WebApplication.MyWebApp
      requirements:
        - host: web_server
        - database_endpoint: db

    web_server:
      type: tosca.nodes.WebServer
      requirements:
        - host: server

    server:
      type: tosca.nodes.Compute
      # details omitted for brevity

    db:
      # This node is abstract (no Deployment or Implementation artifacts
      # on create) and can be substituted with a topology provided by
      # another template that exports a Database type's capabilities.
      type: tosca.nodes.Database
      properties:
        user: my_db_user
        password: secret
        name: my_db_name

Example cited from [TOSCA-Simple-Profile-YAML-v1.1]. Copyright OASIS Open 2017. All Rights Reserved.

Enabling Cloud Native Workloads Orchestration via MultiCloud

Service Orchestrator TOSCA example (substitution template):

tosca_definitions_version: tosca_simple_yaml_1_0

topology_template:
  description: Template of a database including its hosting stack.

  inputs:
    db_user:
      type: string
    db_password:
      type: string
    # other inputs omitted for brevity

  substitution_mappings:
    node_type: tosca.nodes.Database
    capabilities:
      database_endpoint: [ database, database_endpoint ]

  node_templates:
    database:
      type: tosca.nodes.Database
      properties:
        user: { get_input: db_user }
        # other properties omitted for brevity
      requirements:
        - host: dbms

    dbms:
      type: tosca.nodes.DBMS.MySQL
      # details omitted for brevity

    server:
      type: tosca.nodes.Compute
      # details omitted for brevity

Example cited from [TOSCA-Simple-Profile-YAML-v1.1]. Copyright OASIS Open 2017. All Rights Reserved.

Enabling Cloud Native Workloads Orchestration via MultiCloud

Cloudify plugin for MultiCloud, the actual interaction with the MultiCloud APIs:
- Issues the sequence of calls to the MultiCloud layer to perform the required operations
- Handles success and error responses

Template-level orchestration in the Service Orchestrator, e.g.:
1. Instantiate a compute host A (to host a database server)
2. Deploy MySQL on compute host A
3. Create a database on MySQL from the inputs of database name, username, and password, and the input artifacts of database content
4. Instantiate a compute host B (to host the web server)
5. Deploy the web server on compute host B
6. Deploy the web application on the web server
7. Configure the web application to connect to the database

MultiCloud deploys and operates at the atomic-resource level with minimum orchestration, e.g.:
- Find a cloud, if none is specified, with MySQL and Apache web-server capability, e.g. a K8S cluster
- Prepare a deployment profile, including the specific capability request and affinity rules
- Deploy and configure accordingly through the Kubernetes plugin (the cloud-specific south-bound interface)
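
The seven template-level steps above amount to an ordered sequence of MultiCloud calls with success/error handling. A toy sketch of that sequencing; the client class, resource names, and parameters are all invented for illustration:

```python
class MultiCloudClient:
    """Toy stand-in for the MultiCloud API; real calls would be REST."""
    def __init__(self):
        self.log = []
    def deploy(self, resource, **params):
        self.log.append(resource)
        return {"resource": resource, "status": "CREATED", **params}

def orchestrate_webapp(mc, db_name, db_user, db_password):
    """Template-level orchestration: issue the calls in order and stop
    on the first failure (error handling kept minimal)."""
    steps = [
        ("compute_host_a", {}),
        ("mysql",          {"host": "compute_host_a"}),
        ("database",       {"name": db_name, "user": db_user,
                            "password": db_password}),
        ("compute_host_b", {}),
        ("web_server",     {"host": "compute_host_b"}),
        ("web_app",        {"host": "web_server"}),
        ("app_db_binding", {"app": "web_app", "db": "database"}),
    ]
    for resource, params in steps:
        result = mc.deploy(resource, **params)
        if result["status"] != "CREATED":
            raise RuntimeError(f"step {resource} failed")
    return mc.log

mc = MultiCloudClient()
print(len(orchestrate_webapp(mc, "my_db_name", "my_db_user", "secret")))  # 7
```

Everything inside each deploy call, such as picking a K8S cluster with the required capability and applying affinity rules, stays below the line, in the MultiCloud layer.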