EPHEMERAL DEVOPS: ADVENTURES IN MANAGING SHORT-LIVED SYSTEMS

Similar documents
Best practices to achieve optimal memory allocation and remote desktop user experience

Getting Started With Containers

EVERYTHING AS CODE A Journey into IT Automation and Standardization. Raphaël Pinson

Building a Data-Friendly Platform for a Data- Driven Future

Installing or Upgrading ANM Virtual Appliance

CONTINUOUS DELIVERY WITH DC/OS AND JENKINS

Patching and Updating your VM SUSE Manager. Donald Vosburg, Sales Engineer, SUSE

SECURITY POLICY COMPLIANCE WITH PUPPET AND ANSIBLE. Sean M. Shore Best Buy MSP RHUG Dec 2017

Aspirin as a Service: Using the Cloud to Cure Security Headaches

platform Development Process Optimization For Drupal centric projects

Liquibase Version Control For Your Schema. Nathan Voxland April 3,

Linux System Management with Puppet, Gitlab, and R10k. Scott Nolin, SSEC Technical Computing 22 June 2017

Unlocking Azure with Puppet Enterprise. November 29, 2016

70-532: Developing Microsoft Azure Solutions

Administering vrealize Log Insight. September 20, 2018 vrealize Log Insight 4.7

CA Agile Central Installation Guide On-Premises release

Quick Prototyping+CI with LXC and Puppet

Transparent Service Migration to the Cloud Clone existing VMs to CloudStack/OpenStack templates without user downtime. CloudOpen Seattle 2015

VMware Site Recovery Technical Overview First Published On: Last Updated On:

Beyond 1001 Dedicated Data Service Instances

Vmware Workstation Delete Snapshot Cleaning Up Deleted Files

vmpooler pdxdevops : April 2015

VMware vfabric Data Director 2.5 EVALUATION GUIDE

CA Agile Central Administrator Guide. CA Agile Central On-Premises

Building virtual Mac environments with VMware

VMware View Upgrade Guide

DevOps Course Content

CERN: LSF and HTCondor Batch Services

(TBD GB/hour) was validated by ESG Lab

SRM 8.1 Technical Overview First Published On: Last Updated On:

SRM 6.5 Technical Overview February 26, 2018

LENS Server Maintenance Guide JZ 2017/07/28

Managing ReadyClones

What is version control? (discuss) Who has used version control? Favorite VCS? Uses of version control (read)

Online Help StruxureWare Central

Dell EMC Ready System for VDI on XC Series

Cisco Prime Service Catalog Virtual Appliance Quick Start Guide 2

Advanced Continuous Delivery Strategies for Containerized Applications Using DC/OS

Dell EMC Ready Architectures for VDI

Software Development I

Puppet 101 Basic installation for master and agent machines on Ubuntu with VMware Workstation

Dell EMC Ready System for VDI on VxRail

70-532: Developing Microsoft Azure Solutions

CONTINUOUS DELIVERY WITH MESOS, DC/OS AND JENKINS

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

vsphere Upgrade Update 1 Modified on 4 OCT 2017 VMware vsphere 6.5 VMware ESXi 6.5 vcenter Server 6.5

A WEB-BASED SOLUTION TO VISUALIZE OPERATIONAL MONITORING LINUX CLUSTER FOR THE PROTODUNE DATA QUALITY MONITORING CLUSTER

Table of Contents DevOps Administrators

Continuous Delivery of your infrastructure. Christophe

PassTest. Bessere Qualität, bessere Dienstleistungen!

RAIFFEISENBANK BULGARIA

Using git To Manage Your System's Configuration

Developing Microsoft Azure Solutions (70-532) Syllabus

CSV-W14 - BUILDING AND ADOPTING A CLOUD-NATIVE SECURITY PROGRAM

STREAMLINING THE DELIVERY, PROTECTION AND MANAGEMENT OF VIRTUAL DESKTOPS. VMware Workstation and Fusion. A White Paper for IT Professionals

Think Small to Scale Big

Virtualizing Oracle on VMware

VMware vsphere with ESX 6 and vcenter 6

Table of Contents 1.1. Introduction. Overview of vsphere Integrated Containers 1.2

VMware - VMware vsphere: Install, Configure, Manage [V6.7]

Exam : VMWare VCP-310

P a g e 1. Teknologisk Institut. Online kursus k SysAdmin & DevOps Collection

Seven Habits of Highly Effective Jenkins Users

Developing Microsoft Azure Solutions (70-532) Syllabus

Micro Focus Desktop Containers

Redhat OpenStack 5.0 and PLUMgrid OpenStack Networking Suite 2.0 Installation Hands-on lab guide

Using DC/OS for Continuous Delivery

vsphere Replication for Disaster Recovery to Cloud vsphere Replication 8.1

IBM Cloud for VMware Solutions vrealize Automation 7.2 Chef Integration

Secure VFX in the Cloud. Microsoft Azure

Online Help StruxureWare Data Center Expert

Oracle Corporation 1

vsphere Installation and Setup Update 2 Modified on 10 JULY 2018 VMware vsphere 6.5 VMware ESXi 6.5 vcenter Server 6.5

UP! TO DOCKER PAAS. Ming

Working in Teams CS 520 Theory and Practice of Software Engineering Fall 2018

James Turnbull

Table of Contents 1.1. Overview. Containers, Docker, Registries vsphere Integrated Containers Engine

Image Management for View Desktops using Mirage

Servers & Developers. Julian Nadeau Production Engineer

SaaSaMe Transport Workload Snapshot Export for. Alibaba Cloud

Immutable Servers. Building a deployment pipeline and deploying to EC2 Spot

vsphere Replication for Disaster Recovery to Cloud vsphere Replication 6.5

VMware vrealize Operations for Horizon Administration. 20 SEP 2018 VMware vrealize Operations for Horizon 6.6

70-414: Implementing an Advanced Server Infrastructure Course 01 - Creating the Virtualization Infrastructure

Continuous integration & continuous delivery. COSC345 Software Engineering

NexentaStor VVOL

What s New in Catalogic ECX 2.5

Technical Overview. Jack Smith Sr. Solutions Architect

Docker at Lyft Speeding up development Matthew #dockercon

VMware vrealize Operations for Horizon Administration. Modified on 3 JUL 2018 VMware vrealize Operations for Horizon 6.4

Horizon DaaS Platform 6.1 Release Notes. This document describes changes to the Horizon DaaS Platform for Version 6.1.

Installation and setup guide of 1.1 demonstrator

Breaking Barriers Exploding with Possibilities

vcenter Server Installation and Setup Modified on 11 MAY 2018 VMware vsphere 6.7 vcenter Server 6.7

HySecure Quick Start Guide. HySecure 5.0

This tutorial provides a basic understanding of the infrastructure and fundamental concepts of managing an infrastructure using Chef.

DevOps Online Training

NET1821BU THE FUTURE OF NETWORKING AND SECURITY WITH NSX-T Bruce Davie CTO, APJ 2

Overhauling Dev Arch with Ansible Tower and Docker

Git, the magical version control

Transcription:

SESSION ID: CSV-W12 EPHEMERAL DEVOPS: ADVENTURES IN MANAGING SHORT-LIVED SYSTEMS Todd Carr DevOps Engineer Unity Technologies @frozenfoxx

Who am I? DevOps Engineer at Unity Technologies Security Enthusiast Enormous fan of config management Github: Keybase: Twitter: frozenfoxx frozenfoxx @frozenfoxx 2

WHAT ARE EPHEMERAL SYSTEMS

What are Ephemeral Systems? Short-lived 4

What are Ephemeral Systems? Short-lived Light, middle, or heavyweight VMs 5

What are Ephemeral Systems? Short-lived Light, middle, or heavyweight VMs Dynamically deployed 6

What are Ephemeral Systems? Short-lived Light, middle, or heavyweight VMs Dynamically deployed Dynamically configured 7

What are Ephemeral Systems? Short-lived Light, middle, or heavyweight VMs Dynamically deployed Dynamically configured Dynamically destroyed 8

What are Ephemeral Systems? Short-lived Light, middle, or heavyweight VMs Dynamically deployed Dynamically configured Dynamically destroyed Usually heterogeneous 9

What did I build? Create and destroy about 600~1,000 heavyweight virtual machines an hour Most of those run extremely CPU and disk intensive operations Updating existing and new VM configurations takes seconds Upgrades can be rolled out or rolled back in production extremely quickly Small team (three people) maintains it Bootstrapped with vsphere + Puppet 10

WHY EPHEMERAL SYSTEMS?

Why Ephemeral Systems? Multiple immediately-available VMs 12

Why Ephemeral Systems? Multiple immediately-available VMs Non-containerized applications Desktop apps Legacy apps Complex VMs 13

Why Ephemeral Systems? Multiple immediately-available VMs Non-containerized applications Desktop apps Legacy apps Complex VMs Heterogeneous target pools Multiple OSes Multiple configurations Multiple patch targets Lots of iterative testing 14

Why Ephemeral Systems? Multiple immediately-available VMs Non-containerized applications Desktop apps Legacy apps Complex VMs Heterogeneous target pools Multiple OSes Multiple configurations Multiple patch targets Lots of iterative testing Existing infrastructure New flexibility without breaking anything Doesn t require buying new hardware 15

Why Ephemeral Systems? Testing Rapid, immediate feedback with new code 16

Why Ephemeral Systems? Testing Rapid, immediate feedback with new code Experimenting Rapidly deploy on-the-fly changes 17

Why Ephemeral Systems? Testing Rapid, immediate feedback with new code Experimenting Rapidly deploy on-the-fly changes Simulating Fully leverage dynamic environment configuration management tools r10k (Puppet) grinder (Salt) 18

Why Ephemeral Systems? Testing Rapid, immediate feedback with new code Experimenting Rapidly deploy on-the-fly changes Simulating Fully leverage dynamic environment configuration management tools r10k (Puppet) grinder (Salt) Parallelization Building Testing 19

Why Ephemeral Systems? Testing Rapid, immediate feedback with new code Experimenting Rapidly deploy on-the-fly changes Simulating Fully leverage dynamic environment configuration management tools r10k (Puppet) grinder (Salt) Parallelization Building Testing Don t have budget for new data centers or administrators 20

Why Ephemeral Systems in Security? Exploit Development Write a revision, grab a target from multiple different pools of targets, destroy when done! Make a pool for every target Hook the grab, use, and destroy VM loop for every test script 21

Why Ephemeral Systems in Security? Clean Slate Experimentation Rapidly deploy on-the-fly changes Simply call the API to destroy a machine at the conclusion of every test New machines for every run No more restore from snapshot 22

Why Ephemeral Systems in Security? Dynamic Behavior Simulate changes in active installations Simply commit a change to a Hiera data file, run Puppet Need something even more dynamic? Make a Puppet Environment branch, deploy, and run the same machine against both branches No need to manually modify machines, all are still built from the same template 23

Why Ephemeral Systems in Security? Narrowed Attack Window Non-containerized applications tend to stick around a long time Complex VM requirements Non-Linux OSes Specific patch levels Custom software installations Treat these VMs as containers Create, use, destroy, loop, all via API 24

Why Ephemeral Systems in Security? Information Isolation No more wiping machines or rolling back to snapshots and hoping nothing is left on disk Grab a VM, use it, and dump it When the old one is destroyed it takes its environment with it, ensuring no disk recovery within the VM 25

TOOLS

Tools vsphere VMs 27

Tools vsphere VMs Puppet 4 (https://puppet.com/) Agent, Server, PuppetDB r10k 28

Tools vsphere VMs Puppet 4 (https://puppet.com/) Agent, Server, PuppetDB r10k VmPooler (https://github.com/puppetlabs/vmpooler) 29

Tools vsphere VMs, VM parameters Puppet 4 (https://puppet.com/) Agent, Server, PuppetDB r10k VmPooler (https://github.com/puppetlabs/vmpooler) Redis BIND ISC-DHCP-Server Dynamic DNS Updates from DHCP Server rbvmomi 30

Tools vsphere VMs, VM parameters Puppet 4 (https://puppet.com/) Agent, Server, PuppetDB r10k VmPooler (https://github.com/puppetlabs/vmpooler) Redis BIND ISC-DHCP-Server Dynamic DNS Updates from DHCP Server rbvmomi Coffee 31

BUILD

Build: Concepts Pools 33

Build: Concepts Pools Self configuration Puppet Hiera VMware GuestInfo Variables (hostname, pool, DNS, etc) 34

Build: Concepts Pools Self configuration Puppet Hiera VMware GuestInfo Variables (hostname, pool, DNS, etc) Cleanup scripts 35

Build: Flow 36

Build: Flow 37

Build: Support Puppet Autosigner (https://github.com/frozenfoxx/util/blob/master/puppet/puppetautosign) Certificate cleanup Remove old & dead node certs, reinventory https://github.com/frozenfoxx/util/blob/master/puppet/puppet-reap Nodes cleaning script Reports, facts, nodes https://github.com/frozenfoxx/util/blob/master/puppet/puppet-cleanup-nodes 38

Build: Support Puppet Autosigner (https://github.com/frozenfoxx/util/blob/master/puppet/puppetautosign) Certificate cleanup Remove old & dead node certs, reinventory https://github.com/frozenfoxx/util/blob/master/puppet/puppet-reap Nodes cleaning script Reports, facts, nodes https://github.com/frozenfoxx/util/blob/master/puppet/puppet-cleanup-nodes VmPooler Logrotate for vmpooler.log Install provided init script 39

Build: Support Puppet Autosigner (https://github.com/frozenfoxx/util/blob/master/puppet/puppet-autosign) Certificate cleanup Remove old & dead node certs, reinventory https://github.com/frozenfoxx/util/blob/master/puppet/puppet-reap Nodes cleaning script Reports, facts, nodes https://github.com/frozenfoxx/util/blob/master/puppet/puppet-cleanup-nodes VmPooler Logrotate for vmpooler.log Install provided init script vsphere Ramdisk cleaner 40

Build: Monitoring Pools empty PuppetServer, PuppetDB down Full disk Too many files in a dir to remove Certificates BIND/DHCP issues Logging can get massive Weird vsphere things Ramdisk fills up from creating/destroying VMs 41

PERFORMANCE

Performance PuppetServer holds up well 4 Cores, 16GB RAM, Linux Around 600~1,000 VMs per hour Load avg: 3.0 ~ 5.0 Creating certs, deleting certs, signing certs, compiling catalogs 43

Performance PuppetServer holds up well 4 Cores, 16GB RAM, Linux Around 600~1,000 VMs per hour Load avg: 3.0 ~ 5.0 Creating certs, deleting certs, signing certs, compiling catalogs PuppetDB hold up extremely well Not even phased by this usage, very low load 44

Performance PuppetServer holds up well 4 Cores, 16GB RAM, Linux Around 600~1,000 VMs per hour Load avg: 3.0 ~ 5.0 Creating certs, deleting certs, signing certs, compiling catalogs PuppetDB hold up extremely well Not even phased by this usage, very low load vsphere holds up okay Linked clones are instantaneous (!) vsphere VM itself may fall over, taking the API with it Needs restarting every six to nine months, YMMV 45

Performance PuppetServer holds up well 4 Cores, 16GB RAM, Linux Around 600~1,000 VMs per hour Load avg: 3.0 ~ 5.0 Creating certs, deleting certs, signing certs, compiling catalogs PuppetDB hold up extremely well Not even phased by this usage, very low load vsphere holds up okay Linked clones are instantaneous (!) vsphere VM itself may fall over, taking the API with it Needs restarting every six to nine months, YMMV DHCP/BIND holds up okay...mostly Once a year or so stops adding/removing, just restart 46

USAGE

Usage: General Get a box curl -d --url vmpooler.somewhere.com:4567/api/v1/vm/[vm-type] Checks out a box, [box hostname] Use that box All done? Dump the box curl -X DELETE --url vmpooler.somewhere.com:4567/api/v1/vm/[box hostname] Loop 48

Usage: Parallel Testing Batches Array of tests Get boxes Loop over retrieval for array of boxes curl -d --url vmpooler.somewhere.com:4567/api/v1/vm/[vm-type] Run block of tests against array of boxes All done? Dump the boxes Loop over array of boxes curl -X DELETE --url vmpooler.somewhere.com:4567/api/v1/vm/[box hostname] Loop 49

Usage: Dynamic Environments New Puppet branch, need to test Get a box curl -d --url vmpooler.somewhere.com:4567/api/v1/vm/[vm-type] Checks out a box, [box hostname] SSH to that box Let s config that box Normal mode: puppet agent --test New feature: puppet agent --test --environment [featurebranch] All done? Dump the box curl -X DELETE --url vmpooler.somewhere.com:4567/api/v1/vm/[box hostname] Loop Merge Puppet branch 50

Usage: Dynamic App Behavior Make a new Puppet environment, [newbehavior] Users, configs, whatever needs to be simulated in Hiera and Manifests Deploy with r10k Get a box curl -d --url vmpooler.somewhere.com:4567/api/v1/vm/[vm-type] SSH to that box, alter the app behavior Normal behavior: puppet agent test New behavior: puppet agent --test --environment [newbehavior] Test All done? Dump the box and branch curl -X DELETE --url vmpooler.somewhere.com:4567/api/v1/vm/[box hostname] git push origin :[newbehavior] 51

MAINTENANCE

Maintenance These examples are using Puppet These sorts of concerns will affect ANY tool doing config management Salt, Chef, CFengine, Puppet, all have the same concerns They all expect nodes to live a long time Maintenance is...different Ephemeral VMs die all the time, that s okay If any component dies, the pools drain Drained pools are bad Bad pools are sad pools 53

Maintenance vmpooler.log Size, rotation, needs monitoring 54

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup 55

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation 56

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation Certificate cleanup 57

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation Certificate cleanup Certificate autosigning 58

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation Certificate cleanup Certificate autosigning PuppetServer JVM/CPU allocation 59

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation Certificate cleanup Certificate autosigning PuppetServer JVM/CPU allocation PuppetDB JVM/CPU allocation 60

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation Certificate cleanup Certificate autosigning PuppetServer JVM/CPU allocation PuppetDB JVM/CPU allocation PuppetDB database upgrades/migration 61

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation Certificate cleanup Certificate autosigning PuppetServer JVM/CPU allocation PuppetDB JVM/CPU allocation PuppetDB database upgrades/migration Losing PuppetServer/PuppetDB certificates (means rebuild) 62

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation Certificate cleanup Certificate autosigning PuppetServer JVM/CPU allocation PuppetDB JVM/CPU allocation PuppetDB database upgrades/migration Losing PuppetServer/PuppetDB certificates (means rebuild) Disk filling up 63

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation Certificate cleanup Certificate autosigning PuppetServer JVM/CPU allocation PuppetDB JVM/CPU allocation PuppetDB database upgrades/migration Losing PuppetServer/PuppetDB certificates (means rebuild) Disk filling up Failing to self-cleanup (too many files) 64

Maintenance vmpooler.log Size, rotation, needs monitoring Puppet Cleanup Certificate allocation Certificate cleanup Certificate autosigning PuppetServer JVM/CPU allocation PuppetDB JVM/CPU allocation PuppetDB database upgrades/migration Losing PuppetServer/PuppetDB certificates (means rebuild) Disk filling up Failing to self-cleanup (too many files) vsphere stops responding 65

EXTENDING

Extending TerraForm + Packer TerraForm for management hosts Packer for management hosts & Ephemeral VMs 67

Extending TerraForm + Packer TerraForm for management hosts Packer for management hosts & Ephemeral VMs ChatOps Calls to VmPooler API 68

Extending TerraForm + Packer TerraForm for management hosts Packer for management hosts & Ephemeral VMs ChatOps Calls to VmPooler API Containers for management components VmPooler has a container, but it includes Redis (heavy) Redis containers exist Puppet containers aren t 100% supported (but work!) 69

Extending TerraForm + Packer TerraForm for management hosts Packer for management hosts&ephemeral VMs ChatOps Calls to VmPooler API Containers for management components VmPooler has a container, but it includes Redis (heavy) Redis containers exist Puppet containers aren t 100% supported (but work!) Removing PuppetDB If you aren t using the data or collections, it can only fail here Lose speed on compilation, YMMV 70

Extending One more wild idea 71

Extending One more wild idea Remove PuppetServer from Ephemeral VM loop Go full standalone Use puppet apply to self-configure Use Packer scripts to prebuild only parts from Hiera and Codebase relevant to an Ephemeral VM type Lose flexibility for testing quickly Gain reliability on the server side No more certificate cleanup 72

SUMMARY

Summary VmPooler is awesome! 74

Summary VmPooler is awesome! Dynamic Environments are awesome! 75

Summary VmPooler is awesome! Dynamic Environments are awesome! Puppet is awesome! 76

Summary VmPooler is awesome! Dynamic Environments are awesome! Puppet is awesome! Dynamic DNS + DHCP is awesome! 77

Summary VmPooler is awesome! Dynamic Environments are awesome! Puppet is awesome! Dynamic DNS + DHCP is awesome! Dynamic pools are awesome! 78

Summary VmPooler is awesome! Dynamic Environments are awesome! Puppet is awesome! Dynamic DNS + DHCP is awesome! Dynamic pools are awesome! Everything is awesome! 79

Apply What You Have Learned Today Deploy toolchain VMs Vmpooler, DHCP + Bind, PuppetServer + PuppetDB Reconfigure BIND for Dynamic DNS Updates Create pool templates OSes, patch levels, installed software, desired targets Experiment! Exploit Development Clean Slate Experimentation Dynamic Behavior Narrowed Attack Windows Information Isolation (Usage: General Parallel Testing Batches) (Usage: General Dynamic Environment) (Usage: Dynamic App Behavior) (Usage: General Dynamic App Behavior) (Usage: General Dynamic Environment) 80

QUESTIONS