@unterstein @dcos @bedcon #bedcon Operating microservices with Apache Mesos and DC/OS 1
Johannes Unterstein Software Engineer @Mesosphere @unterstein @unterstein.mesosphere 2017 Mesosphere, Inc. All Rights Reserved. 2
In the beginning there was a big Monolith 3
COMPUTERS Application Operating System Hardware 4
INTERNET - Remote Users! Web Application Operating System Hardware 5
DISTRIBUTION - Horizontal Scale Fault Tolerance Availability Load Balancing Web App Web App Web App Operating System Operating System Operating System Hardware Hardware Hardware 6
SERVICEORIENTED ARCHITECTURE - Separation of concerns - Optimization of bottlenecks - Smaller teams - API Contracts - Data replication - Complicated provisioning - Dependency management Web App Web App Web App Operating System Operating System Operating System Hardware Hardware Hardware 7
HARDWARE VIRTUALIZATION - Fast provisioning Isolation Portability Utilization Configuration Management - Virtual Networking - Credential management Web App Web App Web App Operating System Operating System Operating System Machine Machine Machine Infrastructure 8
MICROSERVICES - Polyglot Single Responsibility Smaller Teams Utilization Machine types/groups - Dependency hell App App App Operating System Operating System Operating System Machine Machine Machine Infrastructure 9
OVERVIEW noun ˈmīkrō/ /ˈsərvəs/ : an approach to application development in which a large application is built as a suite of modular services. Each module supports a specific business goal and uses a simple, well-defined interface to communicate with other modules.* Microservices are designed to be flexible, resilient, efficient, robust, and individually scalable. *From whatis.com
MICROSERVICES - Polyglot Single Responsibility Smaller Teams Utilization Machine types/groups - Dependency hell App App App Operating System Operating System Operating System Machine Machine Machine Infrastructure 11
Run everything in containers!
CONTAINERS - Rapid deployment - Dependency vendoring - Container image repositories - Spreadsheet scheduling App App App Container Runtime Container Runtime Container Runtime OS OS OS Machine Machine Machine Infrastructure 13
CONTAINER SCHEDULING 14
RESOURCE MANAGEMENT 15
SERVICE MANAGEMENT 16
CONTAINER ORCHESTRATION CONTAINER SCHEDULING RESOURCE MANAGEMENT SERVICE MANAGEMENT - Load Balancing Readiness Checking 17
CONTAINER ORCHESTRATION CONTAINER SCHEDULING - Placement Replication/Scaling Resurrection Rescheduling Rolling Deployment Upgrades Downgrades Collocation RESOURCE MANAGEMENT - Memory CPU GPU Volumes Ports IPs Images/Artifacts SERVICE MANAGEMENT - Labels Groups/Namespaces Dependencies Load Balancing Readiness Checking 18
CONTAINER ORCHESTRATION Web Apps & s Orchestration Management Scheduling Resource Management Container Runtime Container Runtime Container Runtime Machine & OS Machine & OS Machine & OS Machine Infrastructure 19
Meanwhile... MapReduce is crunching Data 20
DATA PROCESSING AT HYPERSCALE EVENTS INGEST STORE ANALYZE ACT Ubiquitous data streams from connected devices Ingest millions of events per second Distributed & highly scalable database Real-time and batch process data Visualize data and build data driven applications Apache Spark Akka Sensors Devices Clients Apache Kafka Apache Cassandra DC/OS 2017 Mesosphere, Inc. All Rights Reserved. 21
DATA PROCESSING AT HYPERSCALE EVENTS INGEST STORE ANALYZE ACT Ubiquitous data streams from connected devices Ingest millions of events per second Distributed & highly scalable database Real-time and batch process data Visualize data and build data driven applications Apache Spark Akka Sensors Devices Clients Apache Kafka Apache Cassandra DC/OS 2017 Mesosphere, Inc. All Rights Reserved. 22
STREAM PROCESSING Apache Storm Apache Spark Apache Samza Apache Flink Apache Apex Concord cloud-only: AWS Kinesis, Google Cloud Dataflow 2017 Mesosphere, Inc. All Rights Reserved. 23
EXECUTION MODEL Micro-Batching Native Streaming 2017 Mesosphere, Inc. All Rights Reserved. 24
DATA PROCESSING AT HYPERSCALE EVENTS INGEST STORE ANALYZE ACT Ubiquitous data streams from connected devices Ingest millions of events per second Distributed & highly scalable database Real-time and batch process data Visualize data and build data driven applications Apache Spark Akka Sensors Devices Clients Apache Kafka Apache Cassandra DC/OS 2017 Mesosphere, Inc. All Rights Reserved. 25
Datastores 2017 Mesosphere, Inc. All Rights Reserved. 26
Data Model Relational Key-Value Schema SQL Foreign Keys/Joins OLTP/OLAP Simple Scalable Cache Graph Complex relations Social Graph Recommend ation Fraud detections Document Time-Series Files Schema-Less Semi-structu red queries Product catalogue Session data 2017 Mesosphere, Inc. All Rights Reserved. 27
Modern datacenter 28
Flink Cassandra µs Spark RDB
Flink Cassandra µs Spark RDB
KEEP IT STATIC A naive approach to handling varied app requirements: static partitioning. Maintaining sufficient headroom to handle peak workloads on all partitions leads to poor utilization overall. time 31
SHARED RESOURCES Multiple frameworks can use the same cluster resources, with their share adjusting dynamically. time 32
SILOS OF DATA, SERVICES, USERS, ENVIRONMENTS Spark/Hadoop Kafka mysql Multiplexing 30-40% utilization, more at some users microservices Industry Average 12-15% utilization Cassandra Typical Datacenter siloed, over-provisioned servers, low utilization 4X Modern Datacenter automated schedulers, workload multiplexing onto the same machines 33
34
THE BIRTH OF MESOS TWITTER TECH TALK APACHE INCUBATION The grad students working on Mesos give a tech talk at Twitter. Spring 2009 Mesos enters the Apache Incubator. September 2010 March 2010 April 2016 December 2010 CS262B MESOS PUBLISHED Ben Hindman, Andy Konwinski and Matei Zaharia create Nexus as their CS262B class project. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center is published as a technical report. DC/OS 35
Apache Mesos A top-level Apache project A cluster resource negotiator Scalable to 10,000s of nodes Fault-tolerant, battle-tested An SDK for distributed apps Native Docker support 36
MESOS FUNDAMENTALS ARCHITECTURE 37
STORAGE OPTIONS Default Sandbox Simple to use, Task failures Persistent Volumes Task failures, (permanent) Node failures Distributed File System/External Storage Node failures, non-local writes 38
2017 Mesosphere, Inc. All Rights Reserved. 39
DC/OS ENABLES MODERN DISTRIBUTED APPS Modern App Components Microservices (in containers) Big Data + Analytics Engines Streaming Batch Functions & Logic Analytics Machine Learning Search Time Series Databases SQL / NoSQL Datacenter Operating System (DC/OS) Distributed Systems Kernel (Mesos) Any Infrastructure (Physical, Virtual, Cloud) 40
THE BASICS DC/OS is 100% open source (ASL2.0) + A big, diverse community An umbrella for ~30 OSS projects + Roadmap and designs + Docs, tutorials, setup installations.. + Check https://dcos.io Familiar, with more features + Networking, Security, CLI, UI, Discovery, Load Balancing, Packages,... 41
Best Practices 42
GOING TO PRODUCTION Deployment Discovery Monitoring Logging 43
BEST PRACTICES DEPLOYMENT Version configurations Private registries Resource limits Make it HA 44
BEST PRACTICES SERVICE DISCOVERY Dynamic ports Virtual IPs DNS Overlay networks External tools 45
BEST PRACTICES MONITORING & LOGGING Application metrics Health checks Alerting Aggregate logs Consistent service logs 46
Questions? Code: https://git.io/v1djv @unterstein https://git.io/vxuoy github.com/unterstein @mesosphere / mesosphere.com @dcos / dcos.io / chat.dcos.io 2017 Mesosphere, Inc. All Rights Reserved. 47