Exploring Cloud Security, Operational Visibility & Elastic Datacenters
Kiran Mohandas, Consulting Engineer

The Ideal Goal of Network Access Policies
VISION: Provide Connectivity, Security, and Manageability for:
1. People to Apps
2. Apps to Apps
Diagram layers: Manageability & Operations; Security Policy & Visualization; Connectivity
Diagram end-points: People (Developers, Net Ops, CISO, ...); Apps and Custom Apps (running in multiple environments)
Diagram elements: CPE, IP Fabric, VMs, Containers, BMS, Firewall, Remote Branch Office, Multi-site Data Center / Private Cloud (VMs, BMS, Containers, VNFs), Telco POPs, Public Cloud (VPCs)

PROBLEM STATEMENT
Current Behavior:
  App1, Deployment = Dev: Network Policy = P1
  App1, Deployment = Staging: Network Policy = P2
  App1, Deployment = Prod: Network Policy = P3
Desired Behavior: use one policy and apply it to all the different deployments
  App1, Deployment = Dev: Policy = P
  App1, Deployment = Staging: Policy = P
  App1, Deployment = Prod: Policy = P
1. Reduced complexity (fewer policies)
2. Simplified manageability (change control, etc. is much easier)
3. Improved scalability

PROBLEM STATEMENT
Once a set of policies is defined for a particular OpenStack environment, can it be applied to other environments?
One Policy applied to:
  App1, Deployment = Dev
  App1, Deployment = Staging
  App1, Deployment = Prod
  App1, Deployment = Dev-AWS
  App1, Deployment = Dev-K8s
  App1, Deployment = Dev-Mesos
  App1, Deployment = Staging-BMS (Bare Metal Servers)
1. Reduced complexity (fewer policies)
2. Simplified manageability (change control, etc. is much easier)
3. Improved scalability
4. Define / review / approve once, use everywhere

Redefining Policies
Intent Driven Objects
Policy example: allow TCP 80 tier=web > tier=app match deployment && site
  Tag expressions in the example: tier=web, tier=app, deployment && site
Diagram labels: Policy, Tags, Tags / Labels, Policy Enforcement
Objects at different levels can be tagged. Tags can be defined at different levels:
  Global
  Project
  Network
  VM / Container / BMS
  Interface
Policies are ultimately enforced at the interface level.

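The policy example above can be sketched in a few lines of Python. This is only an illustration, not Contrail's implementation or API: the names `effective_tags`, `Rule`, and `rule_applies` are hypothetical, the sketch assumes that tags set at a more specific level override those from a broader level, and it assumes that `match deployment && site` means both end-points must carry the same value for each listed tag.

```python
from dataclasses import dataclass

# Levels named on the slide, ordered from broadest to most specific.
LEVELS = ("global", "project", "network", "workload", "interface")


def effective_tags(tags_by_level):
    """Merge tags from every level into the set seen at the interface.

    Assumption: a more specific level overrides a broader one.
    """
    merged = {}
    for level in LEVELS:
        merged.update(tags_by_level.get(level, {}))
    return merged


@dataclass(frozen=True)
class Rule:
    action: str
    protocol: str
    port: int
    src: dict     # source tag expression, e.g. {"tier": "web"}
    dst: dict     # destination tag expression, e.g. {"tier": "app"}
    match: tuple  # tag keys whose values must be equal on both end-points


def has_tags(tags, expr):
    """True if the end-point carries every key=value pair in the expression."""
    return all(tags.get(key) == value for key, value in expr.items())


def rule_applies(rule, src_tags, dst_tags, protocol, port):
    """Check one flow against one rule (hypothetical semantics, see above)."""
    if (protocol, port) != (rule.protocol, rule.port):
        return False
    if not (has_tags(src_tags, rule.src) and has_tags(dst_tags, rule.dst)):
        return False
    return all(
        key in src_tags and src_tags[key] == dst_tags.get(key)
        for key in rule.match
    )


# allow TCP 80 tier=web > tier=app match deployment && site
RULE = Rule("allow", "tcp", 80, {"tier": "web"}, {"tier": "app"},
            ("deployment", "site"))

web_dev = effective_tags({"project": {"deployment": "dev", "site": "us"},
                          "interface": {"tier": "web"}})
app_dev = effective_tags({"project": {"deployment": "dev", "site": "us"},
                          "workload": {"tier": "app"}})
app_prod = effective_tags({"project": {"deployment": "prod", "site": "us"},
                           "workload": {"tier": "app"}})

print(rule_applies(RULE, web_dev, app_dev, "tcp", 80))   # same deployment and site
print(rule_applies(RULE, web_dev, app_prod, "tcp", 80))  # deployment differs
```

Because the rule refers only to tags, the same rule covers Dev, Staging, and Prod without being rewritten; only the tags attached to each deployment change.
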
Use Case: Address Groups
Policy: deny web-service any address-group=blacklist <> any action=log
Diagram: two end-points and an Internet / WAN GW, with example addresses 66.220.144.5, 205.15.20.10, 10.1.1.1, 20.1.1.1, 10.1.1.2, 30.1.1.1, 20.1.1.2
Address Group = blacklist
CSO / Compliance / Operator can configure prefixes and individual addresses in the address group, or add a label:
1. 66.220.144.0/24
2. 69.63.176.0/24
3. 204.15.20.0/24
4. 30.1.1.1
5. Label = red

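A minimal sketch of how an address group like the one above could be evaluated, using Python's standard `ipaddress` module. The `AddressGroup` class and the `evaluate` function are hypothetical names for illustration and are not Contrail's API. The sketch assumes that `<>` means the rule applies in both directions, and that an end-point carrying the label `red` counts as a member of the group.

```python
import ipaddress


class AddressGroup:
    """A named set of prefixes, single addresses, and labels (hypothetical)."""

    def __init__(self, name, prefixes=(), labels=()):
        self.name = name
        # A bare address such as "30.1.1.1" becomes a /32 network.
        self.networks = [ipaddress.ip_network(p) for p in prefixes]
        self.labels = set(labels)

    def contains(self, address, endpoint_labels=()):
        ip = ipaddress.ip_address(address)
        if any(ip in net for net in self.networks):
            return True
        # Assumption: a matching label also makes the end-point a member.
        return bool(self.labels & set(endpoint_labels))


blacklist = AddressGroup(
    "blacklist",
    prefixes=["66.220.144.0/24", "69.63.176.0/24", "204.15.20.0/24", "30.1.1.1"],
    labels=["red"],
)


def evaluate(src, dst, src_labels=(), dst_labels=()):
    """Return "deny" (and log) if either side is in the blacklist, else None.

    Assumption: "<>" in the slide's rule means both directions are checked.
    """
    if blacklist.contains(src, src_labels) or blacklist.contains(dst, dst_labels):
        print(f"log: denied {src} <> {dst}")
        return "deny"
    return None


print(evaluate("10.1.1.1", "66.220.144.5"))  # destination is inside 66.220.144.0/24
print(evaluate("10.1.1.1", "20.1.1.1"))      # neither side is in the group
```

Note that 205.15.20.10 from the diagram does not fall inside 204.15.20.0/24 as listed, so this sketch would not match it.
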
Simplified Security Policies for Hybrid & Multi-Cloud
Config APIs for automation & integration with other firewall management tools
Analytics APIs for integration with SIEM & other tools
Single SDN / Security deployment with connectivity and security for multiple environments
Diagram labels: Policy & Analytics Framework; Custom

Multiple Enforcement Points
Multiple policy enforcement points for both L4 and L7 firewalls, focused on compatibility & performance
vrouter enforcing L4 policies and sending traffic to the L7 FW for advanced security services
Tags, Address Groups & FW Policies through APIs
Policy Definition: PANO, R80, SD
L7 FW: 3rd-party or Juniper FW
Enforcement points in the diagram: Host Based L7 Firewall; Bare Metal Server; Compute with vrouter (Kernel / DPDK, vcenter); Smart NIC vrouter; Public Cloud Instance
Other diagram labels: Operator, Web UI, Apps, OpenStack, Other Orch., Config, Analytics, CONTRAIL CONTROLLER, L4 policy configurations

Bringing Visibility into your cloud
Contrail Security Applications Flow
Granular visualization of intra- and inter-application traffic flows without applying any policy
Contrail Security Policy Flow
Contrail Security Key Capabilities
Consistent Intent-Driven Policy (DEFINITION)
  How to extend the same set of policies to Mesos, AWS, Kubernetes, and Bare Metal Servers without policy rule explosion
Multiple Enforcement Points (ENFORCEMENT)
  L4 enforcement at the vrouter (Kernel, DPDK, vcenter, Smart NIC)
  L7 enforcement at the L7 Firewall
Application Policy Config & Flow Visualization
  Offer visualization, analytics, and orchestration for security configurations
  Provide reporting, troubleshooting, and compliance
  Discover inter- and intra-application traffic flows with/without enforcing policies
Diagram labels: Security Admin; No Policy Rewrite; Define Once, Enforce Everywhere; Single policy; OpenStack Controller; Site = US; Web, App, DB; L4; L7; Host-Based FW

Cloud Analytics for Building an Elastic Datacenter
Operational Challenges in Cloud DC
LARGE # OF HETEROGENEOUS HARDWARE & SOFTWARE COMPONENTS
  A large number of heterogeneous, fragile & interconnected hardware and software components makes it a challenge to run cloud at scale
HUGE AMOUNTS OF MONITORING DATA FROM MULTIPLE SOURCES
  Multiple data sources generate large amounts of data
  Real-time management and monitoring of large & disparate data sets requires complex data / storage management tools
NO OUT-OF-THE-BOX SOLUTION
  Legacy tools were not built for cloud-native environments
  Open-source based tools (Nagios, Zabbix) require significant customization and lack production-grade reliability and scalability

CHALLENGE: Multi-tenant or not, many applications suffer due to NOISY NEIGHBORS
Poor control over where/how workloads execute
  A hypervisor or container runtime may over-subscribe resources
  Applications may compete for shared resources, resulting in unpredictable performance
  Orchestration schedulers don't account for all the isolation metrics
  Some isolation is poor or non-existent, e.g. I/O throughput and latency for disk and network, caches, etc.
Diagram labels: SHARED INFRASTRUCTURE, VM1, VM2, HYPERVISOR, SHARED HOST

CHALLENGE: FAILED MONITORING
Monitoring is often NOT REAL-TIME
  Inefficient request-response: 6 minutes
  Too slow to influence orchestration
  Cluster management is running blind
  SaaS monitoring in the cloud is too far away and expensive
Missing the Goldilocks amount of monitoring
  Many big-data clusters and DBs are more complex than the apps and I&O being monitored
  Some monitoring is too primitive or narrow
  May monitor only some types of apps or infrastructure
Telemetry without analysis is incomplete
  Manual or post-mortem = you lose
  Passive visibility and alerting, not able to fix issues
  Provides neither actionable insights nor insights for future planning
Diagram labels: Infrastructure Metrics; e.g. Hadoop Cluster for storing & analyzing metrics

APPFORMIX
Distributed Stream Analysis
ACTIONABLE: REAL-TIME OPTIMIZATIONS
  Local optimizations for shared CPU resources
  Global optimization for workload placement to ensure high performance & high reliability
CONTINUOUS ANALYSIS OF METRICS
  Analyze more metrics
  Faster prediction of failures
SOLUTION SCALES WITH YOUR INFRASTRUCTURE
  No central choke-point
EXTENSIBLE
  Use Nagios style plugins to add your own metrics
Diagram labels: policy, Data Streams, Signals, Distributed Data Platform, RESOURCE ORCHESTRATION, CAPACITY PLANNING, REPORTING & ALARMING, DATA LAKE, NoSQL, Compute & Storage Infrastructure

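For readers unfamiliar with Nagios style plugins, the sketch below shows the general convention such a plugin follows: it prints one status line, optionally followed by `|` and performance data in `label=value;warn;crit` form, and it exits with 0 (OK), 1 (WARNING), 2 (CRITICAL), or 3 (UNKNOWN). How AppFormix discovers and runs plugins is not covered by the slide, so nothing here is specific to AppFormix. The metric (1-minute load average), the thresholds, and the function names are assumptions chosen for illustration.

```python
import os
import sys

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATUS_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(label, value, warn, crit):
    """Map a metric value to an exit code and a Nagios-style output line."""
    if value >= crit:
        code = CRITICAL
    elif value >= warn:
        code = WARNING
    else:
        code = OK
    # Human-readable text, then "|" and performance data: label=value;warn;crit
    line = f"{STATUS_NAMES[code]} - {label}={value:.2f} | {label}={value:.2f};{warn};{crit}"
    return code, line


def main():
    """Report the 1-minute load average (illustrative metric and thresholds)."""
    try:
        load1, _, _ = os.getloadavg()
    except (AttributeError, OSError):
        # os.getloadavg() is not available on every platform.
        print("UNKNOWN - load average not available")
        return UNKNOWN
    code, line = evaluate("load1", load1, warn=4.0, crit=8.0)
    print(line)
    return code


# When saved as a standalone plugin script, finish with:
#     sys.exit(main())
```

Keeping the threshold logic in `evaluate` separate from the metric collection in `main` makes it straightforward to reuse the same output format for other metrics.
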
Cross Layer Visibility
Single operations platform to monitor all layers of the stack:
  APPLICATION & SERVICES
  CLOUD INFRASTRUCTURE
  SOFTWARE DEFINED INFRASTRUCTURE
  PHYSICAL INFRASTRUCTURE

Thank you