A Whirlwind Tour of Apache Mesos

1 A Whirlwind Tour of Apache Mesos

2 About Herdy
- Senior Software Engineer at Citadel Technology Solutions (Singapore)
- The eternal student
- Find me on the internet: _hhandoko, hhandoko, hhandoko

3 Presentation Overview
- Problem Domains
- Mesos Fundamentals
- Mesos Frameworks
- Mesos in the Real-World
- Demo!

4 Once Upon a Tweet
"I've heard of: LAMP, WIMP, MEAN. But what is SMACK?"

5 Mesos in One Paragraph
Apache Mesos abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively.

6 Mesos in One Sentence
- For Operations / DevOps: a next-generation cluster manager
- For Developers / Data Scientists: a distributed systems SDK

7 Mesos in One Sentence (cont'd)
Datacentre timesharing

8 Problem Domain: Static Partitioning
- Many and complex provisioning scripts
- Snowflake servers
- No automated failure handling
- Repartitioning takes hours or days

9 Problem Domain: Resource Management
- Low utilisation rate (i.e. waste)
- Hard to predict workload
- Application performance jitter
- Scale and capacity are coupled

10 The Inspiration: Google Borg
- Top Secret orchestration system (in use since ~2004)
- Efficiently parcels work across Google's vast fleet of computer servers
- Google is building Omega (Borg vNext)

11 The Birth of Apache Mesos
- A research project at the University of California, Berkeley
- Hindman's initial ideas came from working with many-core Intel processors ( cores)
- Hindman teamed up with Konwinski and Zaharia, who were working on a software platform that runs on massive data centres
- Twitter took a keen interest and further developed Mesos (as an open-source project)
- Became an Apache project in 2013

12 Mesos Analogy to an Operating System
[Diagram: Linux compared with Mesos]

13 Mesos vs Virtualization
[Diagram: Virtualization compared with Mesos]

14 Mesos Architecture
- ZooKeeper coordinates the master nodes and elects the leader
- The Mesos master manages agents and schedules Tasks
- Mesos agents make Offers and run Tasks

15 Key Concepts
- Frameworks
  - Mesos understands the technical primitives of distributed computing but has no intelligence on how to do it
  - Frameworks tell Mesos (the kernel) how to run the applications
  - A framework comprises a Scheduler and an Executor
- Resource offers
  - Agents advertise available resources
  - Offers can contain user-defined attributes
- Resource isolation via LXC
- Resource allocation
  - Roles
  - Weights
  - Resource Reservations
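The slide only names roles and weights, so the following is a simplified sketch of how they could interact, assuming an allocator built on Dominant Resource Fairness (the approach Mesos's default allocator is based on, as far as I know). The role names, cluster totals, and the function names `dominant_share` and `next_role` are made up for illustration and are not part of any Mesos API.

```python
def dominant_share(allocated, total):
    """Largest fraction of any single resource type that a role currently holds."""
    return max(allocated[resource] / total[resource] for resource in total)


def next_role(allocations, weights, total):
    """Pick the role with the lowest weighted dominant share.

    Dividing by the weight means a role with weight 2.0 is allowed to hold
    roughly twice the share of a role with weight 1.0 before it loses priority.
    """
    return min(
        allocations,
        key=lambda role: dominant_share(allocations[role], total) / weights[role],
    )


# Hypothetical cluster and roles.
total = {"cpus": 100, "mem": 1000}
allocations = {
    "analytics": {"cpus": 20, "mem": 100},  # dominant resource: cpus (0.2)
    "web": {"cpus": 10, "mem": 300},        # dominant resource: mem (0.3)
}
weights = {"analytics": 2.0, "web": 1.0}

chosen = next_role(allocations, weights, total)
```

With these numbers the weighted shares are 0.1 for `analytics` and 0.3 for `web`, so the next offer in this toy model goes to `analytics`.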

16 Two-tier Scheduling
1. Agents offer resources
2. Allocator decides where to offer the resources
3. Framework may accept an offer and execute a task in an agent, or
4. Framework may reject the offer and it will be passed along
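The four steps can be sketched as a small simulation. This is a toy model, not the Mesos API: the `Offer` and `Framework` classes, the fixed framework ordering, and the rule that an accepted offer is consumed whole are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    agent: str
    cpus: float
    mem_mb: int


@dataclass
class Framework:
    """Hypothetical framework scheduler that wants to run identical tasks."""
    name: str
    task_cpus: float
    task_mem_mb: int
    pending_tasks: int
    launched: list = field(default_factory=list)

    def consider(self, offer):
        fits = offer.cpus >= self.task_cpus and offer.mem_mb >= self.task_mem_mb
        if self.pending_tasks > 0 and fits:
            self.pending_tasks -= 1
            self.launched.append(offer.agent)
            return True   # step 3: accept the offer and run a task on that agent
        return False      # step 4: reject, so the offer is passed along


def allocate(offers, frameworks):
    """Toy allocator (step 2): try frameworks in order until one accepts."""
    unused = []
    for offer in offers:  # step 1: each agent has advertised its resources
        if not any(fw.consider(offer) for fw in frameworks):
            unused.append(offer)
    return unused


offers = [
    Offer("agent-1", cpus=4.0, mem_mb=8192),
    Offer("agent-2", cpus=1.0, mem_mb=1024),
    Offer("agent-3", cpus=0.5, mem_mb=256),
]
web = Framework("web", task_cpus=2.0, task_mem_mb=4096, pending_tasks=1)
batch = Framework("batch", task_cpus=1.0, task_mem_mb=512, pending_tasks=5)

unused = allocate(offers, [web, batch])
```

Here `web` takes the offer from agent-1, rejects the one from agent-2 (it has nothing left to run), and `batch` picks that one up. The offer from agent-3 is too small for either framework and stays unused.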

17 App Specific Frameworks

18 General Purpose Framework: Marathon
- Container and framework orchestration platform
- Runs long-running services (like `init.d`), e.g. web applications
- Features
  - High availability (active / passive)
  - Service discovery & load balancing
  - Health checks
  - Event subscription
  - REST API
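To make the REST API and health-check features concrete, here is a minimal sketch of submitting an app definition. The host name, app id, command, and resource sizes are hypothetical. The `/v2/apps` endpoint and the field names follow Marathon's v2 REST API as I understand it, and the command assumes Marathon assigns the app one port (exposed as `$PORT0`); check both against the version you run. The request is only built, not sent, so the sketch runs without a cluster.

```python
import json
import urllib.request

MARATHON_URL = "http://marathon.example.com:8080"  # hypothetical host

# App definition for a long-running web service with an HTTP health check.
app = {
    "id": "/hello-web",
    "cmd": "python3 -m http.server $PORT0",
    "cpus": 0.25,
    "mem": 64,
    "instances": 2,
    "healthChecks": [
        {
            "protocol": "HTTP",
            "path": "/",
            "portIndex": 0,
            "gracePeriodSeconds": 30,
            "intervalSeconds": 10,
            "maxConsecutiveFailures": 3,
        }
    ],
}

request = urllib.request.Request(
    url=f"{MARATHON_URL}/v2/apps",
    data=json.dumps(app).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# To actually submit the app against a real cluster:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```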

19 General Purpose Framework: Chronos
- Fault-tolerant job scheduler for Mesos
- Distributed `cron`
- Features
  - Distributed and fault-tolerant
  - Supports bash and custom executors
  - Schedules based on ISO8601 repeating interval notation
  - Handles job dependencies
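The ISO8601 repeating interval notation has the form `R[n]/<start>/<period>`, for example `R/2016-03-01T02:00:00Z/PT24H` for "every 24 hours from 1 March 2016, 02:00 UTC, forever". The sketch below parses that form and lists the first run times. It is an illustration rather than Chronos code: it handles only day, hour, minute, and second durations and UTC timestamps ending in `Z`, it treats `Rn` as "run n times", and the function names are invented for this example.

```python
import re
from datetime import datetime, timedelta, timezone

_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def parse_duration(text):
    """Parse a small subset of ISO 8601 durations, e.g. P1D, PT2H, PT1H30M."""
    match = _DURATION.match(text)
    if not match or text in ("P", "PT"):
        raise ValueError(f"unsupported duration: {text!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def parse_schedule(schedule):
    """Split R[n]/start/period into (count or None, start datetime, period)."""
    repeat, start, period = schedule.split("/")
    if not repeat.startswith("R"):
        raise ValueError(f"schedule must start with R: {schedule!r}")
    count = int(repeat[1:]) if len(repeat) > 1 else None  # None means forever
    start_at = datetime.strptime(start, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return count, start_at, parse_duration(period)


def first_runs(schedule, limit):
    """Return up to `limit` scheduled run times, respecting the repeat count."""
    count, start_at, period = parse_schedule(schedule)
    runs = limit if count is None else min(limit, count)
    return [start_at + i * period for i in range(runs)]


nightly = first_runs("R/2016-03-01T02:00:00Z/PT24H", limit=3)
twice = first_runs("R2/2016-03-01T02:00:00Z/PT1H30M", limit=5)
```

`nightly` holds three consecutive 02:00 UTC timestamps, while `twice` stops after two runs because the repeat count caps it.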

20 Framework: Aurora
- Service orchestration framework
- Functionality-wise, combines Marathon + Chronos, and so much more
- Twitter wanted an all-in-one framework for total control

21 BYO Framework
Existing frameworks provide good coverage of most use cases (80/20):
- Hadoop: Batch processing
- Storm: Stream processing
- Chronos: Task scheduling
- Marathon / Aurora: Long-running services

22 Custom Framework Demo!

23 Demo Resources Rendler Code:

24 Mesos in Production Today

25 Mesos and Mesosphere
- Mesos is the name of the open-source Apache project
- Mesosphere (Mesosphere Inc.) is the company that commercializes the open-source project and provides consulting services

26 DC/OS Demo!

27 Demo Resources
- DC/OS Installation Instructions:
- Packet Hosting:
- HashiCorp's Terraform:
- Mesosphere Tweeter App:

28 Predictive Scheduler: Quasar
- Resource-efficient and QoS-aware cluster manager
- Uses fast classification techniques in Machine Learning to profile workloads

29 Mesos on Windows
- Mesosphere is working with Microsoft to port Apache Mesos to work with Windows Servers
- Platform-specific tasks will be run on the supported nodes

30 Fit for Purpose?
- Good Fit: Stateless systems, Web applications, Spark, Hadoop
- Poor Fit: Stateful systems*, Relational Database, Distributed systems, Cassandra
*Note: Support for persistent storage volumes is under active development

31 Whitepapers
- Hindman, B., Konwinski, A., Zaharia, M., Ghodsi, A., Joseph, A.D., Katz, R.H., Shenker, S. and Stoica, I., 2011, March. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. In NSDI (Vol. 11, pp ).
- Verma, A., Pedrosa, L., Korupolu, M., Oppenheimer, D., Tune, E. and Wilkes, J., 2015, April. Large-scale cluster management at Google with Borg. In Proceedings of the Tenth European Conference on Computer Systems (p. 18). ACM.

32 Books

33 Last But Not Least

34 Thanks!


More information

Cloud Computing & Visualization

Cloud Computing & Visualization Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International

More information

MapReduce, Hadoop and Spark. Bompotas Agorakis

MapReduce, Hadoop and Spark. Bompotas Agorakis MapReduce, Hadoop and Spark Bompotas Agorakis Big Data Processing Most of the computations are conceptually straightforward on a single machine but the volume of data is HUGE Need to use many (1.000s)

More information

Everything You Ever Wanted To Know About Resource Scheduling... Almost

Everything You Ever Wanted To Know About Resource Scheduling... Almost logo Everything You Ever Wanted To Know About Resource Scheduling... Almost Tim Hockin Senior Staff Software Engineer, Google @thockin Who is thockin? Founding member of Kubernetes

More information

Distributed ETL. A lightweight, pluggable, and scalable ingestion service for real-time data. Joe Wang

Distributed ETL. A lightweight, pluggable, and scalable ingestion service for real-time data. Joe Wang A lightweight, pluggable, and scalable ingestion service for real-time data ABSTRACT This paper provides the motivation, implementation details, and evaluation of a lightweight distributed extract-transform-load

More information

COSC 6339 Big Data Analytics. Introduction to Spark. Edgar Gabriel Fall What is SPARK?

COSC 6339 Big Data Analytics. Introduction to Spark. Edgar Gabriel Fall What is SPARK? COSC 6339 Big Data Analytics Introduction to Spark Edgar Gabriel Fall 2018 What is SPARK? In-Memory Cluster Computing for Big Data Applications Fixes the weaknesses of MapReduce Iterative applications

More information

arxiv: v1 [cs.ro] 2 May 2018

arxiv: v1 [cs.ro] 2 May 2018 Avalon: Building an Operating System for Robotcenter Yuan Xu, Zhiyuan Yan, Sa Wang, Cheng Yang, Qingsai Xiao and Yungang Bao arxiv:1805.00745v1 [cs.ro] 2 May 2018 Abstract This paper envisions a scenario

More information

Apache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source

Apache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source Apache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source DMITRIY SETRAKYAN Founder, PPMC https://ignite.apache.org @apacheignite @dsetrakyan Agenda About In- Memory Computing Apache Ignite

More information

Real-time personal trainer on the SMACK Jan Anirvan Chakraborty

Real-time personal trainer on the SMACK Jan Anirvan Chakraborty Real-time personal trainer on the SMACK stack @honzam399 Jan Machacek @anirvan_c Anirvan Chakraborty Automated personal trainer - muvr Suggests the sequence of exercise sessions Suggests exercises in a

More information

Armon HASHICORP

Armon HASHICORP Nomad Armon Dadgar @armon Cluster Manager Scheduler Nomad Cluster Manager Scheduler Nomad Schedulers map a set of work to a set of resources Work (Input) Resources Web Server -Thread 1 Web Server -Thread

More information

IBM Planning Analytics Workspace Local Distributed Soufiane Azizi. IBM Planning Analytics

IBM Planning Analytics Workspace Local Distributed Soufiane Azizi. IBM Planning Analytics IBM Planning Analytics Workspace Local Distributed Soufiane Azizi IBM Planning Analytics IBM Canada - Cognos Ottawa Lab. IBM Planning Analytics Agenda 1. Demo PAW High Availability on a Prebuilt Swarm

More information

Distributed Data on Distributed Infrastructure. Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere

Distributed Data on Distributed Infrastructure. Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere Distributed Data on Distributed Infrastructure Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere Kunal Kusoorkar Director Solutions Engineering, ArangoDB @neunhoef Jörg Schad Claudius

More information

CSE 444: Database Internals. Lecture 23 Spark

CSE 444: Database Internals. Lecture 23 Spark CSE 444: Database Internals Lecture 23 Spark References Spark is an open source system from Berkeley Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. Matei

More information

Jupyter and Spark on Mesos: Best Practices. June 21 st, 2017

Jupyter and Spark on Mesos: Best Practices. June 21 st, 2017 Jupyter and Spark on Mesos: Best Practices June 2 st, 207 Agenda About me What is Spark & Jupyter Demo How Spark+Mesos+Jupyter work together Experience Q & A About me Graduated from EE @ Tsinghua Univ.

More information

We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info

We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : PH NO: 9963799240, 040-40025423

More information

Storm. Distributed and fault-tolerant realtime computation. Nathan Marz Twitter

Storm. Distributed and fault-tolerant realtime computation. Nathan Marz Twitter Storm Distributed and fault-tolerant realtime computation Nathan Marz Twitter Basic info Open sourced September 19th Implementation is 15,000 lines of code Used by over 25 companies >2700 watchers on Github

More information

Orchestration Ownage: Exploiting Container-Centric Datacenter Platforms

Orchestration Ownage: Exploiting Container-Centric Datacenter Platforms SESSION ID: CSV-R03 Orchestration Ownage: Exploiting Container-Centric Datacenter Platforms Bryce Kunz Senior Threat Specialist Adobe Mike Mellor Director, Information Security Adobe Intro Mike Mellor

More information

Integrating Apache Mesos with Science Gateways via Apache Airavata

Integrating Apache Mesos with Science Gateways via Apache Airavata Integrating Apache Mesos with Science Gateways via Apache Airavata Organization: Apache Software Foundation Abstract: Science Gateways federate resources from multiple organizations. Most gateways solve

More information

PROFILING BASED REDUCE MEMORY PROVISIONING FOR IMPROVING THE PERFORMANCE IN HADOOP

PROFILING BASED REDUCE MEMORY PROVISIONING FOR IMPROVING THE PERFORMANCE IN HADOOP ISSN: 0976-2876 (Print) ISSN: 2250-0138 (Online) PROFILING BASED REDUCE MEMORY PROVISIONING FOR IMPROVING THE PERFORMANCE IN HADOOP T. S. NISHA a1 AND K. SATYANARAYAN REDDY b a Department of CSE, Cambridge

More information

UNIFY DATA AT MEMORY SPEED. Haoyuan (HY) Li, Alluxio Inc. VAULT Conference 2017

UNIFY DATA AT MEMORY SPEED. Haoyuan (HY) Li, Alluxio Inc. VAULT Conference 2017 UNIFY DATA AT MEMORY SPEED Haoyuan (HY) Li, CEO @ Alluxio Inc. VAULT Conference 2017 March 2017 HISTORY Started at UC Berkeley AMPLab In Summer 2012 Originally named as Tachyon Rebranded to Alluxio in

More information