@joerg_schad Nightmares of a Container Orchestration System

Size: px
Start display at page:

Download "@joerg_schad Nightmares of a Container Orchestration System"

Transcription

1 @joerg_schad Nightmares of a Container Orchestration System 2017 Mesosphere, Inc. All Rights Reserved. 1

2 Jörg Schad Distributed Systems Jan Repnak Support Engineer/ Solution 2017 Mesosphere, Inc. All Rights Reserved. 2

3 3AM 2017 Mesosphere, Inc. All Rights Reserved. 3

4 Scope of Jan s nightmares Modern App Components Big Data + Analytics Microservices (in containers) Engines Functions & Logic Streaming Batch Machine Learning Search Time Series SQL / NoSQL Analytics Databases Datacenter Operating System (DC/OS) Distributed Systems Kernel (Apache Mesos) Any Infrastructure (Physical, Virtual, Cloud) 2017 Mesosphere, Inc. All Rights Reserved. 4

5 Backup State Why should we have backup? Backup state Services Cluster 2017 Mesosphere, Inc. All Rights Reserved. 5

6 Immutable Container Images Use tagged container images Keep tagged images immutable! 2017 Mesosphere, Inc. All Rights Reserved. 6

7 Private Container Registries Dockerhub works great for our test cluster Use tagged container images Keep tagged images immutable! Use a private container registry! 2017 Mesosphere, Inc. All Rights Reserved. 7

8 Repeatable Container Builds `docker commit` is great* Use repeatable builds for images Including FROM clause Keep images minimal Multistage build From scratch 2017 Mesosphere, Inc. All Rights Reserved. 8

9 UI Deployments Use (Marathon) endpoints for deployments! Version (and track) your app definitions! 2017 Mesosphere, Inc. All Rights Reserved. 9

10 Disk Usage All our disk are full Docker and logs are great in filling up disk space! Images Container Cleanup docker instances and images! docker prune Monitor available disk space! 2017 Mesosphere, Inc. All Rights Reserved. 10

11 Resource Constraints 32MB are sufficient for Docker container Memory constraints are hard limits Consider overhead (e.g., Java) Difficult to approximate Monitor 2017 Mesosphere, Inc. All Rights Reserved. 11

12 Zookeeper Cluster Size Our Zookeeper Cluster has 4 nodes, that is better than 3, or? Zookeeper quorum (i.e., #Masters) should be odd! Production 5 is optimal! 2017 Mesosphere, Inc. All Rights Reserved. 12

13 Health Checks What are health checks? Specify Health checks carefully Different options Mesos vs Marathon, Command vs HTTP Impacts Load-Balancers and restarts Readiness checks 2017 Mesosphere, Inc. All Rights Reserved. 13

14 NoSQL Datastores We replaced out Postgres instance with Cassandra, and now we get stale results Consider the semantics of your datastore! ACID vs Base Model your data and queries accordingly! 2017 Mesosphere, Inc. All Rights Reserved. 14

15 Removing Stateful Frameworks Follow the uninstall instructions! Reservations and Zookeeper state! state.json 2017 Mesosphere, Inc. All Rights Reserved. 15

16 Container vs VMs We just replaced all our VM instances by containers* Be aware of different isolation semantics! 2017 Mesosphere, Inc. All Rights Reserved. 16

17 Write Once Run Any Where The (Java) container was running fine in testing Java (<9) not groups aware # threads for GC Set default values carefully 2017 Mesosphere, Inc. All Rights Reserved. 17

18 Mesos Modules To solve this problem, our team quickly developed this really cool Mesos Module Mesos Modules can be tricky! Monitoring and Debugging 2017 Mesosphere, Inc. All Rights Reserved. 18

19 Linux Distributions We are using *obscure Linux distribution* for our (DC/ OS) cluster If possible use tested distributions! Especially for DC/OS! 2017 Mesosphere, Inc. All Rights Reserved. 19

20 Services on the same node We are running *distributed Database* outside Mesos on the same cluster Be careful when running services outside Mesos but on the same cluster! Adjust resources accordingly! 2017 Mesosphere, Inc. All Rights Reserved. 20

21 Spreading out Master Nodes We are running our cluster across different AWS regions.. Be careful when distributing Master nodes across high latency links! Different AWS AZ ok, different region probably not! 2017 Mesosphere, Inc. All Rights Reserved. 21

22 Agent Attributes attributes='rack:abc;zone:west; os:centos5;level:10;keys:[ ]' We changed the agent attributes for running cluster Set agent attributes when starting an agent! Do not change for running agents! 2017 Mesosphere, Inc. All Rights Reserved. 22

23 Cluster Upgrades We upgraded our cluster * Check state before Follow upgrade instructions! Automation Remember Backup! 2017 Mesosphere, Inc. All Rights Reserved. 23

24 UPGRADE PROCEDURE Before upgrading 1. Make sure cluster is healthy! 2. Perform backup a. ZK b. Replicated logs c. other state 3. Review release notes 4. Generate install bundle a. Validate versions Agent Executor Task Framework Scheduler LEADER STANDBY STANDBY Agent Executor Task ZK ZK ZK 2017 Mesosphere, Inc. All Rights Reserved. 24

25 UPGRADE PROCEDURE 1. Master rolling upgrade a. Start with standby b. Uninstall DC/OS c. Install new DC/OS 2. Agent rolling upgrade 3. Framework upgrades Agent Framework Scheduler ZK ZK LEADER STANDBY STANDBY Agent ZK Executor Task Executor Task 2017 Mesosphere, Inc. All Rights Reserved. 25

26 UPGRADE PROCEDURE Framework ZK 1. Master rolling upgrade 2. Agent rolling upgrade a. Uninstall DC/OS b. Install new DC/OS 3. Framework upgrades Scheduler ZK LEADER STANDBY STANDBY ZK Agent Executor Task Agent Executor Task 2017 Mesosphere, Inc. All Rights Reserved. 26

27 UPGRADE PROCEDURE Framework ZK 1. Master rolling upgrade 2. Agent rolling upgrade 3. Framework upgrades a. Orthogonal to DC/OS b. Ensure changes don t affect existing apps Agent Executor Task Scheduler LEADER STANDBY STANDBY Agent Executor Task ZK ZK 2017 Mesosphere, Inc. All Rights Reserved. 27

28 Software Upgrades We have automatic updates enabled for Docker Follow upgrade instructions! Backup! Explicit control of versions! 2017 Mesosphere, Inc. All Rights Reserved. 28

29 Day 2 Operations Our POC app is deployed in our production environment, time for vacation Keep it running! Day 2 Operations is the actually challenging part! 2017 Mesosphere, Inc. All Rights Reserved. 29

30 Day 2 Operations Configuration Updates (ex: Scaling, reconfiguration) Binary Upgrades Cluster Maintenance (ex: Backup, Restore, Restart) Monitor progress of operations Debug any runtime blockages 2017 Mesosphere, Inc. All Rights Reserved. 30

31 METRICS Measurements captured to determine health and performance of cluster - How utilized is the cluster? - Are resources being optimally used? - Is the system performing better or worse over time? - Are there bottlenecks in the system? - What is the response time of applications? 2017 Mesosphere, Inc. All Rights Reserved. 31

32 DC/OS METRIC SOURCES Mesos metrics Resource, frameworks, masters, agents, tasks, system, events Container Metrics CPU, mem, disk, network Application Metrics QPS, latency, response time, hits, active users, errors App App App Container Container Container Mesos OS 2017 Mesosphere, Inc. All Rights Reserved. 32

33 Production Checklist 2017 Mesosphere, Inc. All Rights Reserved. 33

34 MESOS CHECKLIST Monitor both Masters and Agents for flapping (i.e., continuously restarting). This can be accomplished by using the `uptime` metric. Monitor the rate of changes in terminal task states, including TASK_FAILED, TASK_LOST, and TASK_KILLED 2017 Mesosphere, Inc. All Rights Reserved. 34

35 MESOS MASTER CHECKLIST Use five master instances in production. Three is sufficient for HA in staging/test Place masters on separate racks, if possible Secure the teardown endpoints to prevent accidental framework removal Mesosphere, Inc. All Rights Reserved. 35

36 MESOS AGENT CHECKLIST Set agent attributes before you run anything on the cluster. Once an agent is started, changing the attributes may break recovery of running tasks in the event of a restart. See also issues.apache.org/jira/browse/mesos Explicitly set the resources on the nodes to leave capacity for other services running there outside of Mesos control. For example, HDFS processes running alongside Mesos Mesosphere, Inc. All Rights Reserved. 36

37 ZOOKEEPER CHECKLIST Run with security and ACLs, see the `--zk=` and `--master=` flags on the master and slaves respectively. If you do enable ACLs, they must be enabled before nodes are created in ZK. Backup ZooKeeper snapshots and log at regular intervals. - Guano or zkconfig.py (Want Snapshots + Transaction Log) Marathon, Chronos, and other frameworks store state in ZK. The first Marathon should store state in the same ZK as Mesos master. Userland apps should NOT store state in the ZK cluster shared by Mesos and Marathon. Examples of userland apps include Storm, service discovery tools, and additional instances of Marathon and Chronos Mesosphere, Inc. All Rights Reserved. 37

38 ZOOKEEPER CHECKLIST Monitor ZK's JVM metrics, such as heap usage, GC pause times, and full-collection frequency. Monitor ZK for: number of client connections, total number of znodes, size of znodes (min, max, avg, 99% percentile), and read/write performance metrics 2017 Mesosphere, Inc. All Rights Reserved. 38

39 ANY chat.dcos.io /groups/ /dcos /dcos/examples /dcos/demos 2017 Mesosphere, Inc. All Rights Reserved.

Container 2.0. Container: check! But what about persistent data, big data or fast data?!

Container 2.0. Container: check! But what about persistent data, big data or fast data?! @unterstein @joerg_schad @dcos @jaxdevops Container 2.0 Container: check! But what about persistent data, big data or fast data?! 1 Jörg Schad Distributed Systems Engineer @joerg_schad Johannes Unterstein

More information

Scale your Docker containers with Mesos

Scale your Docker containers with Mesos Scale your Docker containers with Mesos Timothy Chen tim@mesosphere.io About me: - Distributed Systems Architect @ Mesosphere - Lead Containerization engineering - Apache Mesos, Drill PMC / Committer

More information

CONTINUOUS DELIVERY WITH MESOS, DC/OS AND JENKINS

CONTINUOUS DELIVERY WITH MESOS, DC/OS AND JENKINS APACHE MESOS NYC MEETUP SEPTEMBER 22, 2016 CONTINUOUS DELIVERY WITH MESOS, DC/OS AND JENKINS WHO WE ARE ROGER IGNAZIO SUNIL SHAH Tech Lead at Mesosphere @rogerignazio Product Manager at Mesosphere @ssk2

More information

利用 Mesos 打造高延展性 Container 環境. Frank, Microsoft MTC

利用 Mesos 打造高延展性 Container 環境. Frank, Microsoft MTC 利用 Mesos 打造高延展性 Container 環境 Frank, Microsoft MTC About Me Developer @ Yahoo! DevOps @ HTC Technical Architect @ MSFT Agenda About Docker Manage containers Apache Mesos Mesosphere DC/OS application = application

More information

CONTINUOUS DELIVERY WITH DC/OS AND JENKINS

CONTINUOUS DELIVERY WITH DC/OS AND JENKINS SOFTWARE ARCHITECTURE NOVEMBER 15, 2016 CONTINUOUS DELIVERY WITH DC/OS AND JENKINS AGENDA Presentation Introduction to Apache Mesos and DC/OS Components that make up modern infrastructure Running Jenkins

More information

@unterstein #bedcon. Operating microservices with Apache Mesos and DC/OS

@unterstein #bedcon. Operating microservices with Apache Mesos and DC/OS @unterstein @dcos @bedcon #bedcon Operating microservices with Apache Mesos and DC/OS 1 Johannes Unterstein Software Engineer @Mesosphere @unterstein @unterstein.mesosphere 2017 Mesosphere, Inc. All Rights

More information

Building a Data-Friendly Platform for a Data- Driven Future

Building a Data-Friendly Platform for a Data- Driven Future Building a Data-Friendly Platform for a Data- Driven Future Benjamin Hindman - @benh 2016 Mesosphere, Inc. All Rights Reserved. INTRO $ whoami BENJAMIN HINDMAN Co-founder and Chief Architect of Mesosphere,

More information

Building/Running Distributed Systems with Apache Mesos

Building/Running Distributed Systems with Apache Mesos Building/Running Distributed Systems with Apache Mesos Philly ETE April 8, 2015 Benjamin Hindman @benh $ whoami 2007-2012 2009-2010 - 2014 my other computer is a datacenter my other computer is a datacenter

More information

Using DC/OS for Continuous Delivery

Using DC/OS for Continuous Delivery Using DC/OS for Continuous Delivery DevPulseCon 2017 Elizabeth K. Joseph, @pleia2 Mesosphere 1 Elizabeth K. Joseph, Developer Advocate, Mesosphere 15+ years working in open source communities 10+ years

More information

Mesosphere and Percona Server for MongoDB. Jeff Sandstrom, Product Manager (Percona) Ravi Yadav, Tech. Partnerships Lead (Mesosphere)

Mesosphere and Percona Server for MongoDB. Jeff Sandstrom, Product Manager (Percona) Ravi Yadav, Tech. Partnerships Lead (Mesosphere) Mesosphere and Percona Server for MongoDB Jeff Sandstrom, Product Manager (Percona) Ravi Yadav, Tech. Partnerships Lead (Mesosphere) Mesosphere DC/OS MICROSERVICES, CONTAINERS, & DEV TOOLS DATA SERVICES,

More information

Mesosphere and Percona Server for MongoDB. Peter Schwaller, Senior Director Server Eng. (Percona) Taco Scargo, Senior Solution Engineer (Mesosphere)

Mesosphere and Percona Server for MongoDB. Peter Schwaller, Senior Director Server Eng. (Percona) Taco Scargo, Senior Solution Engineer (Mesosphere) Mesosphere and Percona Server for MongoDB Peter Schwaller, Senior Director Server Eng. (Percona) Taco Scargo, Senior Solution Engineer (Mesosphere) Mesosphere DC/OS MICROSERVICES, CONTAINERS, & DEV TOOLS

More information

SCALING LIKE TWITTER WITH APACHE MESOS

SCALING LIKE TWITTER WITH APACHE MESOS Philip Norman & Sunil Shah SCALING LIKE TWITTER WITH APACHE MESOS 1 MODERN INFRASTRUCTURE Dan the Datacenter Operator Alice the Application Developer Doesn t sleep very well Loves automation Wants to control

More information

Advantages of using DC/OS Azure infrastructure and the implementation architecture Bill of materials used to construct DC/OS and the ACS clusters

Advantages of using DC/OS Azure infrastructure and the implementation architecture Bill of materials used to construct DC/OS and the ACS clusters Reference implementation: The Azure Container Service DC/OS is a distributed operating system powered by Apache Mesos that treats collections of CPUs, RAM, networking and so on as a distributed kernel

More information

Deploying Applications on DC/OS

Deploying Applications on DC/OS Mesosphere Datacenter Operating System Deploying Applications on DC/OS Keith McClellan - Technical Lead, Federal Programs keith.mcclellan@mesosphere.com V6 THE FUTURE IS ALREADY HERE IT S JUST NOT EVENLY

More information

The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Dublin Apache Kafka Meetup, 30 August 2017.

The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Dublin Apache Kafka Meetup, 30 August 2017. Dublin Apache Kafka Meetup, 30 August 2017 The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Joseph @pleia2 * ASF projects 1 Elizabeth K. Joseph, Developer Advocate Developer Advocate

More information

The Emergence of the Datacenter Developer. Tobi Knaup, Co-Founder & CTO at

The Emergence of the Datacenter Developer. Tobi Knaup, Co-Founder & CTO at The Emergence of the Datacenter Developer Tobi Knaup, Co-Founder & CTO at Mesosphere @superguenter A Brief History of Operating Systems 2 1950 s Mainframes Punchcards No operating systems Time Sharing

More information

Distributed Data on Distributed Infrastructure. Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere

Distributed Data on Distributed Infrastructure. Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere Distributed Data on Distributed Infrastructure Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere Kunal Kusoorkar Director Solutions Engineering, ArangoDB @neunhoef Jörg Schad Claudius

More information

Spark and Flink running scalable in Kubernetes Frank Conrad

Spark and Flink running scalable in Kubernetes Frank Conrad Spark and Flink running scalable in Kubernetes Frank Conrad Architect @ apomaya.com scalable efficient low latency processing 1 motivation, use case run (external, unknown trust) customer spark / flink

More information

AGILE DEVELOPMENT AND PAAS USING THE MESOSPHERE DCOS

AGILE DEVELOPMENT AND PAAS USING THE MESOSPHERE DCOS Sunil Shah AGILE DEVELOPMENT AND PAAS USING THE MESOSPHERE DCOS 1 THE DATACENTER OPERATING SYSTEM (DCOS) 2 DCOS INTRODUCTION The Mesosphere Datacenter Operating System (DCOS) is a distributed operating

More information

Advanced Continuous Delivery Strategies for Containerized Applications Using DC/OS

Advanced Continuous Delivery Strategies for Containerized Applications Using DC/OS Advanced Continuous Delivery Strategies for Containerized Applications Using DC/OS ContainerCon @ Open Source Summit North America 2017 Elizabeth K. Joseph @pleia2 1 Elizabeth K. Joseph, Developer Advocate

More information

Windows Azure Services - At Different Levels

Windows Azure Services - At Different Levels Windows Azure Windows Azure Services - At Different Levels SaaS eg : MS Office 365 Paas eg : Azure SQL Database, Azure websites, Azure Content Delivery Network (CDN), Azure BizTalk Services, and Azure

More information

Issues Fixed in DC/OS

Issues Fixed in DC/OS Release Notes for 1.10.4 These are the release notes for DC/OS 1.10.4. DOWNLOAD DC/OS OPEN SOURCE Issues Fixed in DC/OS 1.10.4 CORE-1375 - Docker executor does not hang due to lost messages. DOCS-2169

More information

Sunil Shah SECURE, FLEXIBLE CONTINUOUS DELIVERY PIPELINES WITH GITLAB AND DC/OS Mesosphere, Inc. All Rights Reserved.

Sunil Shah SECURE, FLEXIBLE CONTINUOUS DELIVERY PIPELINES WITH GITLAB AND DC/OS Mesosphere, Inc. All Rights Reserved. Sunil Shah SECURE, FLEXIBLE CONTINUOUS DELIVERY PIPELINES WITH GITLAB AND DC/OS 1 Introduction MOBILE, SOCIAL & CLOUD ARE RAISING CUSTOMER EXPECTATIONS We need a way to deliver software so fast that our

More information

Mesosphere and the Enterprise: Run Your Applications on Apache Mesos. Steve Wong Open Source Engineer {code} by Dell

Mesosphere and the Enterprise: Run Your Applications on Apache Mesos. Steve Wong Open Source Engineer {code} by Dell Mesosphere and the Enterprise: Run Your Applications on Apache Mesos Steve Wong Open Source Engineer {code} by Dell EMC @cantbewong Open source at Dell EMC {code} by Dell EMC is a group of passionate open

More information

UPGRADING A MESOS CLUSTER

UPGRADING A MESOS CLUSTER MesosCon 2016 - Greg Mann UPGRADING A MESOS CLUSTER 2016 Mesosphere, Inc. All Rights Reserved. 1 Greg Mann Software Engineer Mesos contributor Computational chemist Croissant enthusiast @greggomann 2016

More information

MANAGING MESOS, DOCKER, AND CHRONOS WITH PUPPET

MANAGING MESOS, DOCKER, AND CHRONOS WITH PUPPET Roger Ignazio PuppetConf 2015 MANAGING MESOS, DOCKER, AND CHRONOS WITH PUPPET 2015 Mesosphere, Inc. All Rights Reserved. 1 $(whoami) ABOUT ME Roger Ignazio Infrastructure Automation Engineer @ Mesosphere

More information

Building A Better Test Platform:

Building A Better Test Platform: Building A Better Test Platform: A Case Study of Improving Apache HBase Testing with Docker Aleks Shulman, Dima Spivak Outline About Cloudera Apache HBase Overview API compatibility API compatibility testing

More information

Introduction to Mesos and the Datacenter Operating System

Introduction to Mesos and the Datacenter Operating System Introduction to Mesos and the Datacenter Operating System Artem Harutyunyan (artem@mesosphere.io) 2016 Mesosphere, Inc. All Rights Reserved. INTRO $ whoami ARTEM HARUTYUNYAN ALICE Offline (2004-2010) AliEn

More information

EsgynDB Enterprise 2.0 Platform Reference Architecture

EsgynDB Enterprise 2.0 Platform Reference Architecture EsgynDB Enterprise 2.0 Platform Reference Architecture This document outlines a Platform Reference Architecture for EsgynDB Enterprise, built on Apache Trafodion (Incubating) implementation with licensed

More information

SAMPLE CHAPTER IN ACTION. Roger Ignazio. FOREWORD BY Florian Leibert MANNING

SAMPLE CHAPTER IN ACTION. Roger Ignazio. FOREWORD BY Florian Leibert MANNING SAMPLE CHAPTER IN ACTION Roger Ignazio FOREWORD BY Florian Leibert MANNING Mesos in Action by Roger Ignazio Chapter 1 Copyright 2016 Manning Publications brief contents PART 1 HELLO, MESOS...1 1 Introducing

More information

Scalable Streaming Analytics

Scalable Streaming Analytics Scalable Streaming Analytics KARTHIK RAMASAMY @karthikz TALK OUTLINE BEGIN I! II ( III b Overview Storm Overview Storm Internals IV Z V K Heron Operational Experiences END WHAT IS ANALYTICS? according

More information

Exam : Implementing Microsoft Azure Infrastructure Solutions

Exam : Implementing Microsoft Azure Infrastructure Solutions Exam 70-533: Implementing Microsoft Azure Infrastructure Solutions Objective Domain Note: This document shows tracked changes that are effective as of January 18, 2018. Design and Implement Azure App Service

More information

Microservices at Netflix Scale. First Principles, Tradeoffs, Lessons Learned Ruslan

Microservices at Netflix Scale. First Principles, Tradeoffs, Lessons Learned Ruslan Microservices at Netflix Scale First Principles, Tradeoffs, Lessons Learned Ruslan Meshenberg @rusmeshenberg Microservices: all benefits, no costs? Netflix is the world s leading Internet television network

More information

Beyond 1001 Dedicated Data Service Instances

Beyond 1001 Dedicated Data Service Instances Beyond 1001 Dedicated Data Service Instances Introduction The Challenge Given: Application platform based on Cloud Foundry to serve thousands of apps Application Runtime Many platform users - who don

More information

Building and Running a Solr-as-a-Service SHAI ERERA IBM

Building and Running a Solr-as-a-Service SHAI ERERA IBM Building and Running a Solr-as-a-Service SHAI ERERA IBM Who Am I? Working at IBM Social Analytics & Technologies Lucene/Solr committer and PMC member http://shaierera.blogspot.com shaie@apache.org Background

More information

A Whirlwind Tour of Apache Mesos

A Whirlwind Tour of Apache Mesos A Whirlwind Tour of Apache Mesos About Herdy Senior Software Engineer at Citadel Technology Solutions (Singapore) The eternal student Find me on the internet: _hhandoko hhandoko hhandoko https://au.linkedin.com/in/herdyhandoko

More information

Note: Isolation guarantees among subnets depend on your firewall policies.

Note: Isolation guarantees among subnets depend on your firewall policies. Virtual Networks DC/OS supports Container Networking Interface (CNI)-compatible virtual networking solutions, including Calico and Contrail. DC/OS also provides a native virtual networking solution called

More information

Deep Dive Amazon Kinesis. Ian Meyers, Principal Solution Architect - Amazon Web Services

Deep Dive Amazon Kinesis. Ian Meyers, Principal Solution Architect - Amazon Web Services Deep Dive Amazon Kinesis Ian Meyers, Principal Solution Architect - Amazon Web Services Analytics Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure

More information

DC/OS Metrics. (formerly known as Project Ambrose) Application and Resource Metrics in DC/OS Enterprise. Nick Parker at..

DC/OS Metrics. (formerly known as Project Ambrose) Application and Resource Metrics in DC/OS Enterprise. Nick Parker at.. DC/OS Metrics (formerly known as Project Ambrose) Application and Resource Metrics in DC/OS Enterprise Nick Parker at.. 1 Introduction Nick Parker DC/OS Slack: chat.dcos.io DC/OS Mailing List: users@dcos.io

More information

Best Practices for Developing & Deploying Java Applications with Docker

Best Practices for Developing & Deploying Java Applications with Docker JavaOne 2017 CON7957 Best Practices for Developing & Deploying Java Applications with Docker Eric Smalling - Solution Architect, Docker Inc. @ericsmalling Who Am I? Eric Smalling Solution Architect Docker

More information

Container Orchestration on Amazon Web Services. Arun

Container Orchestration on Amazon Web Services. Arun Container Orchestration on Amazon Web Services Arun Gupta, @arungupta Docker Workflow Development using Docker Docker Community Edition Docker for Mac/Windows/Linux Monthly edge and quarterly stable

More information

InterSystems Cloud Manager & Containers for InterSystems Technologies. Luca Ravazzolo Product Manager

InterSystems Cloud Manager & Containers for InterSystems Technologies. Luca Ravazzolo Product Manager InterSystems Cloud Manager & Containers for InterSystems Technologies Luca Ravazzolo Product Manager InterSystems Cloud Manager 1. What is it? 2. How does it work & How do I use it? 3. Why is it interesting?

More information

Kontejneri u Azureu uz pomoć Kubernetesa što i kako? Tomislav Tipurić Partner Technology Strategist Microsoft

Kontejneri u Azureu uz pomoć Kubernetesa što i kako? Tomislav Tipurić Partner Technology Strategist Microsoft Kontejneri u Azureu uz pomoć Kubernetesa što i kako? Tomislav Tipurić Partner Technology Strategist Microsoft Source: Softpedia Credits: James Niccolai A decade ago no one could have seen this coming.

More information

Architecting for Failure in a Containerized World. Tom Faulhaber Infolace

Architecting for Failure in a Containerized World. Tom Faulhaber Infolace Architecting for Failure in a Containerized World Tom Faulhaber Infolace How can container tech help us build robust systems? Key takeaway: an architectural toolkit for building robust systems with

More information

From Zero to Production Our Mesos Journey. Chien Huey DevOps Engineer XO Group, Inc.

From Zero to Production Our Mesos Journey. Chien Huey DevOps Engineer XO Group, Inc. From Zero to Production Our Mesos Journey Chien Huey DevOps Engineer XO Group, Inc. The Gap Gap Items Getting to production DC/OS: CloudFormation Advanced Installer Vanilla Mesos: Non-DC/OS version Package

More information

Fault Domains in Mesos. Vinod Kone

Fault Domains in Mesos. Vinod Kone Fault Domains in Mesos Vinod Kone (vinodkone@apache.org) About me Apache Mesos PMC and Committer Engineering Manager for Mesos team @ Mesosphere Previously Tech Lead for Mesos team @ Twitter PhD in Computer

More information

Index. Raul Estrada and Isaac Ruiz 2016 R. Estrada and I. Ruiz, Big Data SMACK, DOI /

Index. Raul Estrada and Isaac Ruiz 2016 R. Estrada and I. Ruiz, Big Data SMACK, DOI / Index A ACID, 251 Actor model Akka installation, 44 Akka logos, 41 OOP vs. actors, 42 43 thread-based concurrency, 42 Agents server, 140, 251 Aggregation techniques materialized views, 216 probabilistic

More information

Cloud & container monitoring , Lars Michelsen Check_MK Conference #4

Cloud & container monitoring , Lars Michelsen Check_MK Conference #4 Cloud & container monitoring 04.05.2018, Lars Michelsen Some cloud definitions Applications Data Runtime Middleware O/S Virtualization Servers Storage Networking Software-as-a-Service (SaaS) Applications

More information

How can you implement this through a script that a scheduling daemon runs daily on the application servers?

How can you implement this through a script that a scheduling daemon runs daily on the application servers? You ve been tasked with implementing an automated data backup solution for your application servers that run on Amazon EC2 with Amazon EBS volumes. You want to use a distributed data store for your backups

More information

FROM MONOLITH TO DOCKER DISTRIBUTED APPLICATIONS

FROM MONOLITH TO DOCKER DISTRIBUTED APPLICATIONS FROM MONOLITH TO DOCKER DISTRIBUTED APPLICATIONS Carlos Sanchez @csanchez Watch online at carlossg.github.io/presentations ABOUT ME Senior So ware Engineer @ CloudBees Author of Jenkins Kubernetes plugin

More information

Zabbix on a Clouds. Another approach to a building a fault-resilient, scalable monitoring platform

Zabbix on a Clouds. Another approach to a building a fault-resilient, scalable monitoring platform Zabbix on a Clouds Another approach to a building a fault-resilient, scalable monitoring platform Preface 00:20:00 We will be discussing a few topics on how you will deploy or migrate Zabbix monitoring

More information

Document Sub Title. Yotpo. Technical Overview 07/18/ Yotpo

Document Sub Title. Yotpo. Technical Overview 07/18/ Yotpo Document Sub Title Yotpo Technical Overview 07/18/2016 2015 Yotpo Contents Introduction... 3 Yotpo Architecture... 4 Yotpo Back Office (or B2B)... 4 Yotpo On-Site Presence... 4 Technologies... 5 Real-Time

More information

Enabling Low-friction Continuous Deployment via Workflows and Containers

Enabling Low-friction Continuous Deployment via Workflows and Containers Enabling Low-friction Continuous Deployment via Workflows and Containers Jon Pliske Tristan Blease Groupon Groupon? Overview 1. 2. 3. 4. 5. Architectural transformation What we are building Technology

More information

How to Keep UP Through Digital Transformation with Next-Generation App Development

How to Keep UP Through Digital Transformation with Next-Generation App Development How to Keep UP Through Digital Transformation with Next-Generation App Development Peter Sjoberg Jon Olby A Look Back, A Look Forward Dedicated, data structure dependent, inefficient, virtualized Infrastructure

More information

MapR Enterprise Hadoop

MapR Enterprise Hadoop 2014 MapR Technologies 2014 MapR Technologies 1 MapR Enterprise Hadoop Top Ranked Cloud Leaders 500+ Customers 2014 MapR Technologies 2 Key MapR Advantage Partners Business Services APPLICATIONS & OS ANALYTICS

More information

Using the SDACK Architecture to Build a Big Data Product. Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver

Using the SDACK Architecture to Build a Big Data Product. Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver Using the SDACK Architecture to Build a Big Data Product Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver Outline A Threat Analytic Big Data product The SDACK Architecture Akka Streams and data

More information

Cloud + Big Data Putting it all Together

Cloud + Big Data Putting it all Together Cloud + Big Data Putting it all Together Even Solberg 2009 VMware Inc. All rights reserved 2 Big, Fast and Flexible Data Big Big Data Processing Fast OLTP workloads Flexible Document Object Big Data Analytics

More information

Next-Generation Cloud Platform

Next-Generation Cloud Platform Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology

More information

Building Durable Real-time Data Pipeline

Building Durable Real-time Data Pipeline Building Durable Real-time Data Pipeline Apache BookKeeper at Twitter @sijieg Twitter Background Layered Architecture Agenda Design Details Performance Scale @Twitter Q & A Publish-Subscribe Online services

More information

Orchestration Ownage: Exploiting Container-Centric Datacenter Platforms

Orchestration Ownage: Exploiting Container-Centric Datacenter Platforms SESSION ID: CSV-R03 Orchestration Ownage: Exploiting Container-Centric Datacenter Platforms Bryce Kunz Senior Threat Specialist Adobe Mike Mellor Director, Information Security Adobe Intro Mike Mellor

More information

Azure Webinar. Resilient Solutions March Sander van den Hoven Principal Technical Evangelist Microsoft

Azure Webinar. Resilient Solutions March Sander van den Hoven Principal Technical Evangelist Microsoft Azure Webinar Resilient Solutions March 2017 Sander van den Hoven Principal Technical Evangelist Microsoft DX @svandenhoven 1 What is resilience? Client Client API FrontEnd Client Client Client Loadbalancer

More information

This tutorial will give you a quick start with Consul and make you comfortable with its various components.

This tutorial will give you a quick start with Consul and make you comfortable with its various components. About the Tutorial Consul is an important service discovery tool in the world of Devops. This tutorial covers in-depth working knowledge of Consul, its setup and deployment. This tutorial aims to help

More information

Docker 101 Workshop. Eric Smalling - Solution Architect, Docker

Docker 101 Workshop. Eric Smalling - Solution Architect, Docker Docker 101 Workshop Eric Smalling - Solution Architect, Docker Inc. @ericsmalling Who Am I? Eric Smalling Solution Architect Docker Customer Success Team ~25 years in software development, architecture,

More information

A Generic Microservice Architecture for Environmental Data Management

A Generic Microservice Architecture for Environmental Data Management A Generic Microservice Architecture for Environmental Data Management Clemens Düpmeier, Eric Braun, Thorsten Schlachter, Karl-Uwe Stucky, Wolfgang Suess KIT The Research University in the Helmholtz Association

More information

Deploy Like A Boss Oliver Nicholas

Deploy Like A Boss Oliver Nicholas Deploy Like A Boss Oliver Nicholas DEPLOY LIKE A BOSS THE JOURNEY FROM 2 SERVERS TO 20,000 THE DEPLOYMENT PIPELINE MARCH 1, 2015 3 UBER TECHNOLOGIES, INC BUSINESS METRICS 311 Cities 57 Countries 1,000,000+

More information

Containers and the Evolution of Computing

Containers and the Evolution of Computing Containers and the Evolution of Computing Matt Nowina Solutions Architect 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Scaling Applications Order UI User UI Shipping UI Order

More information

Network Function Virtualization over Open DC/OS Yung-Han Chen

Network Function Virtualization over Open DC/OS Yung-Han Chen Network Function Virtualization over Open DC/OS Yung-Han Chen 2016.05.18 1 Outlines Network Function Virtualization (NFV) Framework Container-based Open Source Solutions for NFV Use Cases 2 NFV Architectural

More information

Table of Contents 1.1. Introduction. Overview of vsphere Integrated Containers 1.2

Table of Contents 1.1. Introduction. Overview of vsphere Integrated Containers 1.2 Table of Contents Introduction Overview of vsphere Integrated Containers 1.1 1.2 2 Overview of vsphere Integrated Containers This document provides an overview of VMware vsphere Integrated Containers.

More information

STATE OF MODERN APPLICATIONS IN THE CLOUD

STATE OF MODERN APPLICATIONS IN THE CLOUD STATE OF MODERN APPLICATIONS IN THE CLOUD 2017 Introduction The Rise of Modern Applications What is the Modern Application? Today s leading enterprises are striving to deliver high performance, highly

More information

Containerization Dockers / Mesospere. Arno Keller HPE

Containerization Dockers / Mesospere. Arno Keller HPE Containerization Dockers / Mesospere Arno Keller HPE What is the Container technology Hypervisor vs. Containers (Huis vs artement) A container doesn't "boot" an OS instead it loads the application and

More information

How Microsoft Built MySQL, PostgreSQL and MariaDB for the Cloud. Santa Clara, California April 23th 25th, 2018

How Microsoft Built MySQL, PostgreSQL and MariaDB for the Cloud. Santa Clara, California April 23th 25th, 2018 How Microsoft Built MySQL, PostgreSQL and MariaDB for the Cloud Santa Clara, California April 23th 25th, 2018 Azure Data Service Architecture Share Cluster with SQL DB Azure Infrastructure Services Azure

More information

Migrating to Cassandra in the Cloud, the Netflix Way

Migrating to Cassandra in the Cloud, the Netflix Way Migrating to Cassandra in the Cloud, the Netflix Way Jason Brown - @jasobrown Senior Software Engineer, Netflix Tech History, 1998-2008 In the beginning, there was the webapp and a single database in a

More information

Personal Statement. Skillset I MongoDB / Cassandra / Redis / CouchDB. My name is Dale-Kurt Murray. I'm a Solutiof

Personal Statement. Skillset I MongoDB / Cassandra / Redis / CouchDB. My name is Dale-Kurt Murray. I'm a Solutiof My name is Dale-Kurt Murray. 'm a Solutiof +1 876 345 7375 Architect who loves new challenging probl :i "rite hello@dalekurtmurray.com which allows me to think outside of the box. visit www.dalekurtmurray.com

More information

Create an application with local persistent volumes

Create an application with local persistent volumes Local Persistent Volumes When you specify a local volume or volumes, tasks and their associated data are pinned to the node they are first launched on and will be relaunched on that node if they terminate.

More information

Google Cloud Platform for Systems Operations Professionals (CPO200) Course Agenda

Google Cloud Platform for Systems Operations Professionals (CPO200) Course Agenda Google Cloud Platform for Systems Operations Professionals (CPO200) Course Agenda Module 1: Google Cloud Platform Projects Identify project resources and quotas Explain the purpose of Google Cloud Resource

More information

Design Patterns for Large- Scale Data Management. Robert Hodges OSCON 2013

Design Patterns for Large- Scale Data Management. Robert Hodges OSCON 2013 Design Patterns for Large- Scale Data Management Robert Hodges OSCON 2013 The Start-Up Dilemma 1. You are releasing Online Storefront V 1.0 2. It could be a complete bust 3. But it could be *really* big

More information

POWERING THE INTERNET WITH APACHE MESOS

POWERING THE INTERNET WITH APACHE MESOS Neil Conway, Niklas Nielsen, Greg Mann & Sunil Shah POWERING THE INTERNET WITH APACHE MESOS 1 MESOS: ORIGINS 2 THE BIRTH OF MESOS TWITTER TECH TALK APACHE INCUBATION The grad students working on Mesos

More information

Azure Certification BootCamp for Exam (Developer)

Azure Certification BootCamp for Exam (Developer) Azure Certification BootCamp for Exam 70-532 (Developer) Course Duration: 5 Days Course Authored by CloudThat Description Microsoft Azure is a cloud computing platform and infrastructure created for building,

More information

SECURING A MARATHON INSTALLATION 2016

SECURING A MARATHON INSTALLATION 2016 MesosCon EU 2016 - Gastón Kleiman SECURING A MARATHON INSTALLATION 2016 2016 Mesosphere, Inc. All Rights Reserved. 1 Gastón Kleiman Distributed Systems Engineer Marathon/Mesos contributor gaston@mesosphere.io

More information

We are ready to serve Latest IT Trends, Are you ready to learn? New Batches Info

We are ready to serve Latest IT Trends, Are you ready to learn? New Batches Info We are ready to serve Latest IT Trends, Are you ready to learn? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : Storage & Database Services : Introduction

More information

Best practices for OpenShift high-availability deployment field experience

Best practices for OpenShift high-availability deployment field experience Best practices for OpenShift high-availability deployment field experience Orgad Kimchi, Red Hat, Christina Wong, NuoDB, Ariff Kassam, NuoDB Traditional disaster recovery means maintaining underutilized

More information

Become a MongoDB Replica Set Expert in Under 5 Minutes:

Become a MongoDB Replica Set Expert in Under 5 Minutes: Become a MongoDB Replica Set Expert in Under 5 Minutes: USING PERCONA SERVER FOR MONGODB IN A FAILOVER ARCHITECTURE This solution brief outlines a way to run a MongoDB replica set for read scaling in production.

More information

Servers fail, who cares? (Answer: I do, sort of) Gregg Ulrich, #netflixcloud #cassandra12

Servers fail, who cares? (Answer: I do, sort of) Gregg Ulrich, #netflixcloud #cassandra12 Servers fail, who cares? (Answer: I do, sort of) Gregg Ulrich, Netflix @eatupmartha #netflixcloud #cassandra12 1 June 29, 2012 2 3 4 [1] 5 From the Netflix tech blog: Cassandra, our distributed cloud persistence

More information

Table of Contents 1.1. Overview. Containers, Docker, Registries vsphere Integrated Containers Engine

Table of Contents 1.1. Overview. Containers, Docker, Registries vsphere Integrated Containers Engine Table of Contents Overview Containers, Docker, Registries vsphere Integrated Containers Engine Management Portal Registry Roles and Personas 1.1 1.1.1 1.1.2 1.1.2.1 1.1.2.2 1.1.2.3 1.1.2.4 2 Overview of

More information

Scaling Marketplaces at Thumbtack QCon SF 2017

Scaling Marketplaces at Thumbtack QCon SF 2017 Scaling Marketplaces at Thumbtack QCon SF 2017 Nate Kupp Technical Infrastructure Data Eng, Experimentation, Platform Infrastructure, Security, Dev Tools Infrastructure from early beginnings You see that?

More information

Cloud I - Introduction

Cloud I - Introduction Cloud I - Introduction Chesapeake Node.js User Group (CNUG) https://www.meetup.com/chesapeake-region-nodejs-developers-group START BUILDING: CALLFORCODE.ORG 3 Agenda Cloud Offerings ( Cloud 1.0 ) Infrastructure

More information

Cisco Tetration Analytics

Cisco Tetration Analytics Cisco Tetration Analytics Enhanced security and operations with real time analytics John Joo Tetration Business Unit Cisco Systems Security Challenges in Modern Data Centers Securing applications has become

More information

CockroachDB on DC/OS. Ben Darnell, CTO, Cockroach Labs

CockroachDB on DC/OS. Ben Darnell, CTO, Cockroach Labs CockroachDB on DC/OS Ben Darnell, CTO, Cockroach Labs Agenda A cloud-native database CockroachDB on DC/OS Why CockroachDB Demo! Cloud-Native Database What is Cloud-Native? Horizontally scalable Individual

More information

Integrate MATLAB Analytics into Enterprise Applications

Integrate MATLAB Analytics into Enterprise Applications Integrate Analytics into Enterprise Applications Aurélie Urbain MathWorks Consulting Services 2015 The MathWorks, Inc. 1 Data Analytics Workflow Data Acquisition Data Analytics Analytics Integration Business

More information

Making Non-Distributed Databases, Distributed. Ioannis Papapanagiotou, PhD Shailesh Birari

Making Non-Distributed Databases, Distributed. Ioannis Papapanagiotou, PhD Shailesh Birari Making Non-Distributed Databases, Distributed Ioannis Papapanagiotou, PhD Shailesh Birari Dynomite Ecosystem Dynomite - Proxy layer Dyno - Client Dynomite-manager - Ecosystem orchestrator Dynomite-explorer

More information

OpenStack Magnum Pike and the CERN cloud. Spyros

OpenStack Magnum Pike and the CERN cloud. Spyros OpenStack Magnum Pike and the CERN cloud Spyros Trigazis @strigazi OpenStack Magnum OpenStack Magnum #openstack-containers Kubernetes, Docker Swarm, Apache Mesos, DC/OS (experimental) aas Deep integration

More information

Esper EQC. Horizontal Scale-Out for Complex Event Processing

Esper EQC. Horizontal Scale-Out for Complex Event Processing Esper EQC Horizontal Scale-Out for Complex Event Processing Esper EQC - Introduction Esper query container (EQC) is the horizontal scale-out architecture for Complex Event Processing with Esper and EsperHA

More information

/ Cloud Computing. Recitation 5 February 14th, 2017

/ Cloud Computing. Recitation 5 February 14th, 2017 15-319 / 15-619 Cloud Computing Recitation 5 February 14th, 2017 1 Overview Administrative issues Office Hours, Piazza guidelines Last week s reflection Project 2.1, OLI Unit 2 modules 5 and 6 This week

More information

Armon HASHICORP

Armon HASHICORP Nomad Armon Dadgar @armon Distributed Optimistically Concurrent Scheduler Nomad Distributed Optimistically Concurrent Scheduler Nomad Schedulers map a set of work to a set of resources Work (Input) Resources

More information

Jupyter and Spark on Mesos: Best Practices. June 21 st, 2017

Jupyter and Spark on Mesos: Best Practices. June 21 st, 2017 Jupyter and Spark on Mesos: Best Practices June 2 st, 207 Agenda About me What is Spark & Jupyter Demo How Spark+Mesos+Jupyter work together Experience Q & A About me Graduated from EE @ Tsinghua Univ.

More information

Namenode HA. Sanjay Radia - Hortonworks

Namenode HA. Sanjay Radia - Hortonworks Namenode HA Sanjay Radia - Hortonworks Sanjay Radia - Background Working on Hadoop for the last 4 years Part of the original team at Yahoo Primarily worked on HDFS, MR Capacity scheduler wire protocols,

More information

Scaling Up Performance Benchmarking

Scaling Up Performance Benchmarking Scaling Up Performance Benchmarking -with SPECjbb2015 Anil Kumar Runtime Performance Architect @Intel, OSG Java Chair Monica Beckwith Runtime Performance Architect @Arm, Java Champion FaaS Serverless Frameworks

More information

This document provides instructions for upgrading a DC/OS cluster.

This document provides instructions for upgrading a DC/OS cluster. Upgrading ENTERPRISE This document provides instructions for upgrading a DC/OS cluster. If this upgrade is performed on a supported OS with all prerequisites fulfilled, this upgrade should preserve the

More information

Logging, Monitoring, and Alerting

Logging, Monitoring, and Alerting Logging, Monitoring, and Alerting Logs are a part of daily life in the DevOps world In security, we focus on particular logs to detect security anomalies and for forensic capabilities A basic logging pipeline

More information

Think Small to Scale Big

Think Small to Scale Big Think Small to Scale Big Intro to Containers for the Datacenter Admin Pete Zerger Principal Program Manager, MVP pete.zerger@cireson.com Cireson Lee Berg Blog, e-mail address, title Company Pete Zerger

More information