A Whirlwind Tour of Apache Mesos
About Herdy
- Senior Software Engineer at Citadel Technology Solutions (Singapore)
- The eternal student
- Find me on the internet: _hhandoko hhandoko hhandoko https://au.linkedin.com/in/herdyhandoko
Presentation Overview
- Problem Domains
- Mesos Fundamentals
- Mesos Frameworks
- Mesos in the Real World
- Demo!
Image source: https://mesosphere.com/wp-content/uploads/2015/04/dcossdashboard.jpg
Once Upon a Tweet
I've heard of: LAMP WIMP MEAN But what is SMACK?
Source: https://twitter.com/theotown/status/643377504527495168
Mesos in One Paragraph Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively. Image source: https://mesosphere.com/wp-content/themes/mesosphere/library/images/views/why-mesos/mesos-architecture.png
Mesos in One Sentence
- Operations / DevOps: Next-Generation Cluster Manager
- Developers / Data Scientists: Distributed Systems SDK
Mesos in One Sentence (cont'd)
Datacentre timesharing
Image source: http://www.computersciencelab.com/computerhistory/htmlhelp/images2/ibm7094.jpg
Problem Domain: Static Partitioning
- Many complex provisioning scripts
- Snowflake servers
- No automated failure handling
- Repartitioning takes hours or days
Problem Domain: Resource Management
- Low utilisation rate (i.e. waste)
- Hard to predict workload
- Application performance jitter
- Scale and capacity are coupled
Image source: http://www.slideshare.net/mesosphere/apache-mesos-and-mesosphere-live-webcast-by-ceo-and-cofounder-florian-li
The Inspiration: Google Borg
- Top-secret orchestration system (in use since ~2004)
- Efficiently parcels work across Google's vast fleet of computer servers
- Google is building Omega (Borg vNext)
Source: https://www.wired.com/2013/03/google-borg-twitter-mesos/all/
The Birth of Apache Mesos
- Began as a research project at the University of California, Berkeley
- Hindman's initial ideas came from working with many-core Intel processors (64 to 128 cores)
- Hindman teamed up with Konwinski and Zaharia, who were working on a software platform that runs across massive data centres
- Twitter took a keen interest and further developed Mesos (as an open-source project)
- Became an Apache project in 2013
Source: https://www.wired.com/2013/03/google-borg-twitter-mesos/all/
Mesos Analogy to an Operating System Linux Mesos
Mesos vs Virtualization Virtualization Mesos
Mesos Architecture
- ZooKeeper coordinates the master nodes and elects a leader
- The Mesos master manages agents and schedules Tasks
- Mesos agents make Offers and run Tasks
Key Concepts
- Frameworks
  - Mesos understands the technical primitives of distributed computing but has no intelligence on how to apply them
  - Frameworks tell Mesos (the kernel) how to run the applications
  - A framework comprises a Scheduler and an Executor
- Resource offers
  - Agents advertise available resources
  - Offers can contain user-defined attributes
  - Resource isolation via LXC
- Resource allocation
  - Roles
  - Weights
  - Resource Reservations
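The effect of role weights can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which a single pool of CPUs is split among roles in proportion to their weights; the real Mesos allocator is more involved and handles several resource types at once. The role names and the `split_by_weight` helper are hypothetical.

```python
from typing import Dict


def split_by_weight(total_cpus: float, weights: Dict[str, float]) -> Dict[str, float]:
    """Split a CPU pool among roles in proportion to their weights.

    Simplified illustration only; this is not the Mesos allocator.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return {role: total_cpus * w / total_weight for role, w in weights.items()}


# Hypothetical roles: "prod" is weighted twice as heavily as "dev".
shares = split_by_weight(12.0, {"prod": 2.0, "dev": 1.0})
print(shares)  # {'prod': 8.0, 'dev': 4.0}
```

With a 2:1 weighting, "prod" is entitled to two thirds of the pool and "dev" to the remaining third.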
Two-tier Scheduling
1. Agents offer resources
2. Allocator decides where to offer the resources
3. Framework may accept an offer and execute a task in an agent, or
4. Framework may reject the offer and it will be passed along
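The four steps above can be sketched as a toy simulation. This is not the Mesos API: the `Offer` and `Framework` classes, the `allocate` function, and the agent and framework names are all hypothetical, and the allocator here simply tries frameworks in list order.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Offer:
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Framework:
    name: str
    # Second tier: the framework itself decides whether to accept an offer.
    wants: Callable[[Offer], bool]


def allocate(offers: List[Offer], frameworks: List[Framework]) -> Dict[str, Optional[str]]:
    """First tier: decide which framework sees each offer, and in what order.

    Returns a mapping of agent name to the framework that accepted its offer,
    or None if every framework declined.
    """
    placement: Dict[str, Optional[str]] = {}
    for offer in offers:
        placement[offer.agent] = None
        for framework in frameworks:
            if framework.wants(offer):
                placement[offer.agent] = framework.name
                break  # accepted: the task runs on this agent
            # declined: the offer is passed along to the next framework
    return placement


offers = [
    Offer("agent-1", cpus=4.0, mem_mb=8192),
    Offer("agent-2", cpus=1.0, mem_mb=1024),
    Offer("agent-3", cpus=0.5, mem_mb=256),
]
frameworks = [
    Framework("batch", wants=lambda o: o.cpus >= 2.0),
    Framework("web", wants=lambda o: o.mem_mb >= 512),
]
placement = allocate(offers, frameworks)
print(placement)  # {'agent-1': 'batch', 'agent-2': 'web', 'agent-3': None}
```

The point of the split is that the allocator never needs to know why a framework accepts or declines; placement logic stays inside each framework.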
App Specific Frameworks
General Purpose Framework: Marathon
- Container and framework orchestration platform
- Runs long-running services (`init.d`), e.g. web applications
- Features
  - High availability (active / passive)
  - Service discovery & load balancing
  - Health checks
  - Event subscription
  - REST API
Image source: https://mesosphere.com/wp-content/themes/mesosphere/library/images/assets/continuous-deployment/marathon2.png
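As a rough illustration of the REST API and health checks, the sketch below builds (but does not send) a request that would submit an application definition to Marathon's `/v2/apps` endpoint. The host `marathon.example.com`, the app id, and the resource figures are assumptions made up for this example; check the field names against the Marathon version you run.

```python
import json
import urllib.request

# Hypothetical Marathon endpoint; replace with your own host and port.
MARATHON_URL = "http://marathon.example.com:8080/v2/apps"

# A small long-running web service with an HTTP health check.
app_definition = {
    "id": "/demo-web",                        # hypothetical app id
    "cmd": "python3 -m http.server $PORT0",   # Marathon assigns $PORT0
    "cpus": 0.25,
    "mem": 64,
    "instances": 2,
    "healthChecks": [
        {
            "protocol": "HTTP",
            "path": "/",
            "portIndex": 0,
            "gracePeriodSeconds": 30,
            "intervalSeconds": 10,
            "timeoutSeconds": 5,
            "maxConsecutiveFailures": 3,
        }
    ],
}

request = urllib.request.Request(
    MARATHON_URL,
    data=json.dumps(app_definition).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# The request is only constructed here. To actually submit it against a
# reachable Marathon instance, call urllib.request.urlopen(request).
print(request.get_method(), request.full_url)
```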
General Purpose Framework: Chronos
- Fault-tolerant job scheduler for Mesos
- Distributed `cron`
- Features
  - Distributed and fault-tolerant
  - Supports bash and custom executors
  - Schedules based on ISO 8601 repeating interval notation
  - Handles job dependencies
Image source: https://mesos.github.io/chronos/img/chronos_ui-1.png
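ISO 8601 repeating interval notation has the shape `R[n]/<start>/<period>`, for example `R3/2016-06-01T09:00:00Z/PT2H`. The sketch below shows how such a string could be expanded into run times. It is a simplified parser written for illustration, not the one Chronos uses: it only handles day, hour, minute, and second periods, and it assumes `Rn` means n runs in total. The helper names are hypothetical.

```python
import re
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Simplified period grammar: P[nD][T[nH][nM][nS]] (no years, months, or weeks).
_PERIOD = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_schedule(schedule: str) -> Tuple[Optional[int], datetime, timedelta]:
    """Split 'R[n]/start/period' into (repeat count, start time, step)."""
    repeat, start, period = schedule.split("/")
    if not repeat.startswith("R"):
        raise ValueError("schedule must start with 'R'")
    # A bare 'R' means repeat indefinitely.
    count = int(repeat[1:]) if len(repeat) > 1 else None
    # Older Python versions do not accept a trailing 'Z' in fromisoformat.
    start_dt = datetime.fromisoformat(start.replace("Z", "+00:00"))
    match = _PERIOD.match(period)
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None} if match else {}
    if not parts:
        raise ValueError(f"unsupported period: {period!r}")
    return count, start_dt, timedelta(**parts)


def run_times(schedule: str, limit: int = 5) -> List[datetime]:
    """Return the first run times, capped at `limit` entries."""
    count, start, step = parse_schedule(schedule)
    n = limit if count is None else min(count, limit)
    return [start + i * step for i in range(n)]


times = run_times("R3/2016-06-01T09:00:00Z/PT2H")
for t in times:
    print(t.isoformat())
```

For the example schedule this yields three runs, two hours apart, starting at 09:00 UTC.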
Framework: Aurora
- Service orchestration framework
- Functionality-wise, it combines Marathon and Chronos, and so much more
- Twitter wanted an all-in-one framework for total control
Image source: http://aurora.apache.org/documentation/latest/images/components.png
BYO Framework
- Existing frameworks provide good coverage of most use cases (80/20)
  - Hadoop: Batch processing
  - Storm: Stream processing
  - Chronos: Task scheduling
  - Marathon / Aurora: Long-running services
Custom Framework Demo!
Demo Resources Rendler Code: https://github.com/mesosphere/rendler
Mesos in Production Today
Mesos and Mesosphere
- Mesos is the name of the open-source Apache project
- Mesosphere (Mesosphere Inc.) is the company that commercializes the open-source project and provides consulting services
DC/OS Demo!
Demo Resources
- DC/OS Installation Instructions: https://dcos.io/docs/1.7/administration/installing/cloud/packet/
- Packet Hosting: https://www.packet.net
- HashiCorp's Terraform: https://www.hashicorp.com/terraform.html
- Mesosphere Tweeter App: https://github.com/mesosphere/tweeter
Predictive Scheduler: Quasar
- Resource-efficient and QoS-aware cluster manager
- Uses fast machine-learning classification techniques to profile workloads
Image source: http://regmedia.co.uk/2014/02/27/quasar.jpg
Mesos on Windows
- Mesosphere is working with Microsoft to port Apache Mesos to Windows Server
- Platform-specific tasks will be run on the supported nodes
Image source: https://media.licdn.com/mpr/mpr/shrinknp_800_800/aaeaaqaaaaaaaarraaaajgvhztyyodflltvkzmmtnguzmi05mzrlltcyzwvlzme1ytu2ma.jpg
Fit for Purpose?
- Good Fit
  - Stateless systems
  - Web applications
  - Spark
  - Hadoop
- Poor Fit
  - Stateful systems*
  - Relational Database
  - Distributed systems
  - Cassandra
*Note: Support for persistent storage volumes is under active development
Whitepapers
- Hindman, B., Konwinski, A., Zaharia, M., Ghodsi, A., Joseph, A.D., Katz, R.H., Shenker, S. and Stoica, I., 2011, March. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. In NSDI (Vol. 11, pp. 22-22).
- Verma, A., Pedrosa, L., Korupolu, M., Oppenheimer, D., Tune, E. and Wilkes, J., 2015, April. Large-scale cluster management at Google with Borg. In Proceedings of the Tenth European Conference on Computer Systems (p. 18). ACM.
Books
Last But Not Least
Thanks!