Scalable task distribution with Scala, Akka and Mesos. Dario
|
|
- Cleopatra Terry
- 6 years ago
- Views:
Transcription
1 Scalable task distribution with Scala, Akka and Mesos Dario
2 What is Mesos? 2
3 What is Mesos? Apache open source project Distributed systems kernel Multi resource scheduler (CPU, Memory, Ports, Disk) Scalable to 10,000s of nodes Fault tolerant First class Docker support 3
4 Distributed Systems Kernel Runs on every node Aggregates all resources in the cluster Provides applications with APIs for resource management and scheduling Offers resources to applications in a fair (configurable) manner 4
5 Fault Tolerance 5
6 Why should I bother? Running many tasks on many machines Scaling up and down the number of tasks and machines Handling failures in the cluster Better resource utilization through multi-tenancy 6
7 Static partitioning Waste of resources! 7
8 Virtualization Operational overhead! 8
9 Multi tenancy + automatic task distribution 9
10 Multi tenancy + automatic task distribution 10
11 Say hi to Marathon 11
12 What is Marathon? Distributed init system Fault tolerant Manages deployments Checks health of applications Manages task and machine failures Runs Docker containers 12
13 Deploying apps with Marathon curl -XPOST -H Content-Type: application/json -d { id : my-app, cmd : python -m SimpleHTTPServer $PORT0, cpus : 1, mem : 64, instances : 10, ports : [0] } 13
14 Deploying apps with Marathon { steps : [ { actions : [ { action : StartApplication, app : /my-app } ] } } 14
15 Deploying apps with Marathon 15
16 Deploying apps with Marathon 16
17 Deploying apps with Marathon Scheduler Actor 17
18 Deploying apps with Marathon Scheduler Actor notifies Deployment Manager 18
19 Deploying apps with Marathon Scheduler Actor notifies Deployment Manager starts Deployment Actor 19
20 Deploying apps with Marathon Scheduler Actor Deployment Manager Deployment Actor notifies starts starts AppStart Actor 20
21 Deploying apps with Marathon EventStream Scheduler Actor Deployment Manager Deployment Actor notifies starts starts AppStart Actor 21
22 Deploying groups with Marathon curl -XPOST -H Content-Type: application/json -d { id : my-group, apps : [ { id : my-app, }, { id : my-other-app, }, { id : yet-another-app, } ] } 22
23 Deploying apps with Marathon { steps : [ { actions : [ { action : StartApplication, app : /my-app }, { action : StartApplication, app : /my-other-app }, { action : StartApplication, app : /yet-another-app } ] } } 23
24 Deploying groups with Marathon EventStream Scheduler Actor Deployment Manager Deployment Actor notifies starts starts AppStart AppStart Actor AppStart Actor Actor 24
25 Dependencies between apps curl -XPOST -H Content-Type: application/json -d { id : my-group, apps : [ { id : my-app, dependencies : [ my-other-app ] }, { id : my-other-app, dependencies : [ yet-another-app ] }, { id : yet-another-app, } ] } 25
26 Deploying dependent groups with Marathon { steps : [ { actions : [{ action : StartApplication, app : /yet-another-app }] }, { actions : [{ action : StartApplication, app : /my-other-app }] }, { actions : [ { action : StartApplication, app : /my-app }] } ] } 26
27 Deploying dependent groups with Marathon EventStream Scheduler Actor Deployment Manager Deployment Actor notifies starts starts AppStart Actor done 27
28 Deploying dependent groups with Marathon = depends on A B C D E 28
29 Deploying dependent groups with Marathon 1. A B C Layered deployment 2. D E 29
30 Can we do better? 30
31 We can do better! = notifies A B C D E 31
32 Writing custom frameworks 32
33 Bindings for many languages C++ Java Scala Clojure Haskell Go Python 33
34 Simple API registered reregistered resourceoffers offerrescinded statusupdate frameworkmessage disconnected slavelost executorlost error 34
35 akka-mesos (work in progress) Non-blocking Stream of messages instead of callbacks Wrappers around the protobuf messages No JNI 35
36 Auto scaling applications 36
37 Auto scaling applications 37
38 Auto scaling applications system.actorof(clustersingletonmanager.props( singletonprops = Props(classOf[MesosScheduler]), singletonname = "consumer", terminationmessage = PoisonPill, role = Some("master")), name = "mesos-scheduler") 38
39 Auto scaling applications val frameworkinfo = FrameworkInfo( name = "autoscale", user = "user", failovertimeout = Some(5.minutes), checkpoint = Some(true) ) 39
40 Auto scaling applications framework = Mesos(context.system).registerFramework( Success(PID(" ", 5050, "master")), frameworkinfo) implicit val materializer = ActorFlowMaterializer() framework.schedulermessages.runforeach (self! _) Cluster(context.system).subscribe(self, classof[clustermetricschanged]) 40
41 Auto scaling applications case ClusterMetricsChanged(metrics) => val (loadsum, heapsum) = metrics.foldleft((0.0, 0.0)) { case ((loadacc, heapacc), metric) => val load = metric.metrics.collectfirst { case x if x.name == "system-load-average" => x.value.doublevalue() } getorelse 0.0 val heap = metric.metrics.collectfirst { case x if x.name == "heap-memory-used" => x.value.doublevalue() } getorelse 0.0 (loadacc + load, heapacc + heap) } val loadavg = loadsum / metrics.size val heapavg = heapsum / metrics.size 41
42 Auto scaling applications if ((loadavg > upperloadthreshold heapavg > upperheapthreshold) && scalebackoff.isoverdue()) { log.info("scaling up!") scaleup = true scalebackoff = 30.seconds.fromNow } 42
43 Auto scaling applications else if ((loadavg < lowerloadthreshold && heapavg < lowerheapthreshold) && scalebackoff.isoverdue() && runningtasks.nonempty) { log.info("scaling down!") scaleup = false val task = runningtasks.head framework.driver.killtask(task) scalebackoff = 30.seconds.fromNow } 43
44 Auto scaling applications case ResourceOffers(offers) if scaleup => val matchingoffer = findmatchingoffer(offers) matchingoffer foreach { offer => val taskinfo = buildtaskinfo(offer) framework.driver.launchtasks(seq(taskinfo), Seq(offer.id)) scaleup = false } (offers diff matchingoffer.toseq).foreach { offer => framework.driver.declineoffer(offer.id) } 44
45 Auto scaling applications case ResourceOffers(offers) => log.info("no need to scale, declining offers.") offers.foreach(offer => framework.driver.declineoffer(offer.id)) 45
46 Auto scaling applications case StatusUpdate(update) => update.status.state match { case TaskRunning => runningtasks += update.status.taskid case TaskKilled TaskError TaskFailed TaskFinished => runningtasks -= update.status.taskid case _ => } log.info(s"${update.status.taskid} is now ${update.status.state}") framework.driver.acknowledgestatusupdate(update) 46
47 Check out the projects
48 Launch a Mesosphere cluster on Google Compute Engine
49 Join the community
50 We are hiring!
51 Thanks for your attention
The Emergence of the Datacenter Developer. Tobi Knaup, Co-Founder & CTO at
The Emergence of the Datacenter Developer Tobi Knaup, Co-Founder & CTO at Mesosphere @superguenter A Brief History of Operating Systems 2 1950 s Mainframes Punchcards No operating systems Time Sharing
More informationScale your Docker containers with Mesos
Scale your Docker containers with Mesos Timothy Chen tim@mesosphere.io About me: - Distributed Systems Architect @ Mesosphere - Lead Containerization engineering - Apache Mesos, Drill PMC / Committer
More informationCONTINUOUS DELIVERY WITH MESOS, DC/OS AND JENKINS
APACHE MESOS NYC MEETUP SEPTEMBER 22, 2016 CONTINUOUS DELIVERY WITH MESOS, DC/OS AND JENKINS WHO WE ARE ROGER IGNAZIO SUNIL SHAH Tech Lead at Mesosphere @rogerignazio Product Manager at Mesosphere @ssk2
More information@joerg_schad Nightmares of a Container Orchestration System
@joerg_schad Nightmares of a Container Orchestration System 2017 Mesosphere, Inc. All Rights Reserved. 1 Jörg Schad Distributed Systems Engineer @joerg_schad Jan Repnak Support Engineer/ Solution Architect
More informationOpenWhisk on Mesos. Tyson Norris/Dragos Dascalita Haut, Adobe Systems, Inc.
OpenWhisk on Mesos Tyson Norris/Dragos Dascalita Haut, Adobe Systems, Inc. OPENWHISK ON MESOS SERVERLESS BACKGROUND OPERATIONAL EVOLUTION SERVERLESS BACKGROUND CUSTOMER FOCUSED DEVELOPMENT SERVERLESS BACKGROUND
More informationMarathon & Metronome Mesosphere, Inc. All Rights Reserved. 1
Marathon & Metronome 2016 Mesosphere, Inc. All Rights Reserved. 1 About Marathon & Metronome Marathon Framework for long running services Metronome Framework for scheduled or one-off jobs 2016 Mesosphere,
More informationCONTINUOUS DELIVERY WITH DC/OS AND JENKINS
SOFTWARE ARCHITECTURE NOVEMBER 15, 2016 CONTINUOUS DELIVERY WITH DC/OS AND JENKINS AGENDA Presentation Introduction to Apache Mesos and DC/OS Components that make up modern infrastructure Running Jenkins
More informationDeploying Applications on DC/OS
Mesosphere Datacenter Operating System Deploying Applications on DC/OS Keith McClellan - Technical Lead, Federal Programs keith.mcclellan@mesosphere.com V6 THE FUTURE IS ALREADY HERE IT S JUST NOT EVENLY
More informationThe SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Dublin Apache Kafka Meetup, 30 August 2017.
Dublin Apache Kafka Meetup, 30 August 2017 The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Joseph @pleia2 * ASF projects 1 Elizabeth K. Joseph, Developer Advocate Developer Advocate
More informationBig data systems 12/8/17
Big data systems 12/8/17 Today Basic architecture Two levels of scheduling Spark overview Basic architecture Cluster Manager Cluster Cluster Manager 64GB RAM 32 cores 64GB RAM 32 cores 64GB RAM 32 cores
More informationUsing DC/OS for Continuous Delivery
Using DC/OS for Continuous Delivery DevPulseCon 2017 Elizabeth K. Joseph, @pleia2 Mesosphere 1 Elizabeth K. Joseph, Developer Advocate, Mesosphere 15+ years working in open source communities 10+ years
More informationUsing the SDACK Architecture to Build a Big Data Product. Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver
Using the SDACK Architecture to Build a Big Data Product Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver Outline A Threat Analytic Big Data product The SDACK Architecture Akka Streams and data
More informationAdvantages of using DC/OS Azure infrastructure and the implementation architecture Bill of materials used to construct DC/OS and the ACS clusters
Reference implementation: The Azure Container Service DC/OS is a distributed operating system powered by Apache Mesos that treats collections of CPUs, RAM, networking and so on as a distributed kernel
More informationBuilding/Running Distributed Systems with Apache Mesos
Building/Running Distributed Systems with Apache Mesos Philly ETE April 8, 2015 Benjamin Hindman @benh $ whoami 2007-2012 2009-2010 - 2014 my other computer is a datacenter my other computer is a datacenter
More information利用 Mesos 打造高延展性 Container 環境. Frank, Microsoft MTC
利用 Mesos 打造高延展性 Container 環境 Frank, Microsoft MTC About Me Developer @ Yahoo! DevOps @ HTC Technical Architect @ MSFT Agenda About Docker Manage containers Apache Mesos Mesosphere DC/OS application = application
More informationHadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved
Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop
More informationContainer 2.0. Container: check! But what about persistent data, big data or fast data?!
@unterstein @joerg_schad @dcos @jaxdevops Container 2.0 Container: check! But what about persistent data, big data or fast data?! 1 Jörg Schad Distributed Systems Engineer @joerg_schad Johannes Unterstein
More informationApache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source
Apache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source DMITRIY SETRAKYAN Founder, PPMC https://ignite.apache.org @apacheignite @dsetrakyan Agenda About In- Memory Computing Apache Ignite
More informationSunil Shah SECURE, FLEXIBLE CONTINUOUS DELIVERY PIPELINES WITH GITLAB AND DC/OS Mesosphere, Inc. All Rights Reserved.
Sunil Shah SECURE, FLEXIBLE CONTINUOUS DELIVERY PIPELINES WITH GITLAB AND DC/OS 1 Introduction MOBILE, SOCIAL & CLOUD ARE RAISING CUSTOMER EXPECTATIONS We need a way to deliver software so fast that our
More informationRedesigning Apache Flink s Distributed Architecture. Till
Redesigning Apache Flink s Distributed Architecture Till Rohrmann trohrmann@apache.org @stsffap 2 1001 Deployment Scenarios Many different deployment scenarios Yarn Mesos Docker/Kubernetes Standalone Etc.
More information1.0. Vinod Kone Anand Mazumdar
1.0 Vinod Kone (vinod@mesosphere.io) Anand Mazumdar (anand@mesosphere.io) MESOS 1.0 1.0 Released on July 27th 2016 Biggest release ever! 1.0.1 Released on Aug 24th 2016 Please use this one! WHAT S IN 1.0?
More informationBuilding a Data-Friendly Platform for a Data- Driven Future
Building a Data-Friendly Platform for a Data- Driven Future Benjamin Hindman - @benh 2016 Mesosphere, Inc. All Rights Reserved. INTRO $ whoami BENJAMIN HINDMAN Co-founder and Chief Architect of Mesosphere,
More informationBe a Microservices Hero ContainerCon 15
https://github.com/adobe-apiplatform Be a Microservices Hero ContainerCon 15 Dragos Dascalita Haut Adobe Presentation scripts: https://gist.github.com/ddragosd/608bf8d3d13e3f688874 A CreativeCloud Microservice
More information@unterstein #bedcon. Operating microservices with Apache Mesos and DC/OS
@unterstein @dcos @bedcon #bedcon Operating microservices with Apache Mesos and DC/OS 1 Johannes Unterstein Software Engineer @Mesosphere @unterstein @unterstein.mesosphere 2017 Mesosphere, Inc. All Rights
More informationAGILE DEVELOPMENT AND PAAS USING THE MESOSPHERE DCOS
Sunil Shah AGILE DEVELOPMENT AND PAAS USING THE MESOSPHERE DCOS 1 THE DATACENTER OPERATING SYSTEM (DCOS) 2 DCOS INTRODUCTION The Mesosphere Datacenter Operating System (DCOS) is a distributed operating
More informationSpark Overview. Professor Sasu Tarkoma.
Spark Overview 2015 Professor Sasu Tarkoma www.cs.helsinki.fi Apache Spark Spark is a general-purpose computing framework for iterative tasks API is provided for Java, Scala and Python The model is based
More informationAn Introduction to Apache Spark
An Introduction to Apache Spark 1 History Developed in 2009 at UC Berkeley AMPLab. Open sourced in 2010. Spark becomes one of the largest big-data projects with more 400 contributors in 50+ organizations
More informationGoogle Cloud Platform for Systems Operations Professionals (CPO200) Course Agenda
Google Cloud Platform for Systems Operations Professionals (CPO200) Course Agenda Module 1: Google Cloud Platform Projects Identify project resources and quotas Explain the purpose of Google Cloud Resource
More informationMarathon has a timer metric that determines how long an event has taken place. Timer does not exist for Mesos observability metrics.
Performance Monitoring Here are some recommendations for monitoring a DC/OS cluster. You can use any monitoring tools. The endpoints listed below will help you troubleshoot when issues occur. Your monitoring
More informationZabbix on a Clouds. Another approach to a building a fault-resilient, scalable monitoring platform
Zabbix on a Clouds Another approach to a building a fault-resilient, scalable monitoring platform Preface 00:20:00 We will be discussing a few topics on how you will deploy or migrate Zabbix monitoring
More informationIntroduction to Mesos and the Datacenter Operating System
Introduction to Mesos and the Datacenter Operating System Artem Harutyunyan (artem@mesosphere.io) 2016 Mesosphere, Inc. All Rights Reserved. INTRO $ whoami ARTEM HARUTYUNYAN ALICE Offline (2004-2010) AliEn
More informationLecture 11 Hadoop & Spark
Lecture 11 Hadoop & Spark Dr. Wilson Rivera ICOM 6025: High Performance Computing Electrical and Computer Engineering Department University of Puerto Rico Outline Distributed File Systems Hadoop Ecosystem
More informationOverview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::
Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized
More informationMesosphere and the Enterprise: Run Your Applications on Apache Mesos. Steve Wong Open Source Engineer {code} by Dell
Mesosphere and the Enterprise: Run Your Applications on Apache Mesos Steve Wong Open Source Engineer {code} by Dell EMC @cantbewong Open source at Dell EMC {code} by Dell EMC is a group of passionate open
More informationPOWERING THE INTERNET WITH APACHE MESOS
Neil Conway, Niklas Nielsen, Greg Mann & Sunil Shah POWERING THE INTERNET WITH APACHE MESOS 1 MESOS: ORIGINS 2 THE BIRTH OF MESOS TWITTER TECH TALK APACHE INCUBATION The grad students working on Mesos
More informationScaling out with Akka Actors. J. Suereth
Scaling out with Akka Actors J. Suereth Agenda The problem Recap on what we have Setting up a Cluster Advanced Techniques Who am I? Author Scala In Depth, sbt in Action Typesafe Employee Big Nerd ScalaDays
More informationStream Processing on IoT Devices using Calvin Framework
Stream Processing on IoT Devices using Calvin Framework by Ameya Nayak A Project Report Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science Supervised
More informationSCALING LIKE TWITTER WITH APACHE MESOS
Philip Norman & Sunil Shah SCALING LIKE TWITTER WITH APACHE MESOS 1 MODERN INFRASTRUCTURE Dan the Datacenter Operator Alice the Application Developer Doesn t sleep very well Loves automation Wants to control
More informationScaling Up & Out. Haidar Osman
Scaling Up & Out Haidar Osman 1- Crash course in Scala - Classes - Objects 2- Actors - The Actor Model - Examples I, II, III, IV 3- Apache Spark - RDD & DAG - Word Count Example 2 1- Crash course in Scala
More informationMANAGING MESOS, DOCKER, AND CHRONOS WITH PUPPET
Roger Ignazio PuppetConf 2015 MANAGING MESOS, DOCKER, AND CHRONOS WITH PUPPET 2015 Mesosphere, Inc. All Rights Reserved. 1 $(whoami) ABOUT ME Roger Ignazio Infrastructure Automation Engineer @ Mesosphere
More informationMESOS A State-Of-The-Art Container Orchestrator Mesosphere, Inc. All Rights Reserved. 1
MESOS A State-Of-The-Art Container Orchestrator 2016 Mesosphere, Inc. All Rights Reserved. 1 About me Jie Yu (@jie_yu) Tech Lead at Mesosphere Mesos PMC member and committer Formerly worked at Twitter
More informationUPGRADING A MESOS CLUSTER
MesosCon 2016 - Greg Mann UPGRADING A MESOS CLUSTER 2016 Mesosphere, Inc. All Rights Reserved. 1 Greg Mann Software Engineer Mesos contributor Computational chemist Croissant enthusiast @greggomann 2016
More informationAdvanced Continuous Delivery Strategies for Containerized Applications Using DC/OS
Advanced Continuous Delivery Strategies for Containerized Applications Using DC/OS ContainerCon @ Open Source Summit North America 2017 Elizabeth K. Joseph @pleia2 1 Elizabeth K. Joseph, Developer Advocate
More informationSpark, Shark and Spark Streaming Introduction
Spark, Shark and Spark Streaming Introduction Tushar Kale tusharkale@in.ibm.com June 2015 This Talk Introduction to Shark, Spark and Spark Streaming Architecture Deployment Methodology Performance References
More informationSAMPLE CHAPTER IN ACTION. Roger Ignazio. FOREWORD BY Florian Leibert MANNING
SAMPLE CHAPTER IN ACTION Roger Ignazio FOREWORD BY Florian Leibert MANNING Mesos in Action by Roger Ignazio Chapter 1 Copyright 2016 Manning Publications brief contents PART 1 HELLO, MESOS...1 1 Introducing
More informationMicroservices without the Servers: AWS Lambda in Action
Microservices without the Servers: AWS Lambda in Action Dr. Tim Wagner, General Manager AWS Lambda August 19, 2015 Seattle, WA 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved Two
More informationApache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context
1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes
More informationGoals Pragmukko features:
Pragmukko Pragmukko Pragmukko is an Akka-based framework for distributed computing. Originally, it was developed for building an IoT solution but it also fit well for many other cases. Pragmukko treats
More informationReactive App using Actor model & Apache Spark. Rahul Kumar Software
Reactive App using Actor model & Apache Spark Rahul Kumar Software Developer @rahul_kumar_aws About Sigmoid We build realtime & big data systems. OUR CUSTOMERS Agenda Big Data - Intro Distributed Application
More informationDeep Dive Amazon Kinesis. Ian Meyers, Principal Solution Architect - Amazon Web Services
Deep Dive Amazon Kinesis Ian Meyers, Principal Solution Architect - Amazon Web Services Analytics Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure
More informationHow we built a highly scalable Machine Learning platform using Apache Mesos
How we built a highly scalable Machine Learning platform using Apache Mesos Daniel Sârbe Development Manager, BigData and Cloud Machine Translation @ SDL Co-founder of BigData/DataScience Meetup Cluj,
More informationBeyond MapReduce: Apache Spark Antonino Virgillito
Beyond MapReduce: Apache Spark Antonino Virgillito 1 Why Spark? Most of Machine Learning Algorithms are iterative because each iteration can improve the results With Disk based approach each iteration
More informationDATA SCIENCE USING SPARK: AN INTRODUCTION
DATA SCIENCE USING SPARK: AN INTRODUCTION TOPICS COVERED Introduction to Spark Getting Started with Spark Programming in Spark Data Science with Spark What next? 2 DATA SCIENCE PROCESS Exploratory Data
More informationFault Domains in Mesos. Vinod Kone
Fault Domains in Mesos Vinod Kone (vinodkone@apache.org) About me Apache Mesos PMC and Committer Engineering Manager for Mesos team @ Mesosphere Previously Tech Lead for Mesos team @ Twitter PhD in Computer
More informationIndex. Raul Estrada and Isaac Ruiz 2016 R. Estrada and I. Ruiz, Big Data SMACK, DOI /
Index A ACID, 251 Actor model Akka installation, 44 Akka logos, 41 OOP vs. actors, 42 43 thread-based concurrency, 42 Agents server, 140, 251 Aggregation techniques materialized views, 216 probabilistic
More informationBenchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms
Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms Candidate Andrea Spina Advisor Prof. Sonia Bergamaschi Co-Advisor Dr. Tilmann Rabl Co-Advisor
More informationStorm. Distributed and fault-tolerant realtime computation. Nathan Marz Twitter
Storm Distributed and fault-tolerant realtime computation Nathan Marz Twitter Basic info Open sourced September 19th Implementation is 15,000 lines of code Used by over 25 companies >2700 watchers on Github
More informationA Whirlwind Tour of Apache Mesos
A Whirlwind Tour of Apache Mesos About Herdy Senior Software Engineer at Citadel Technology Solutions (Singapore) The eternal student Find me on the internet: _hhandoko hhandoko hhandoko https://au.linkedin.com/in/herdyhandoko
More informationEvolution of an Apache Spark Architecture for Processing Game Data
Evolution of an Apache Spark Architecture for Processing Game Data Nick Afshartous WB Analytics Platform May 17 th 2017 May 17 th, 2017 About Me nafshartous@wbgames.com WB Analytics Core Platform Lead
More information2/26/2017. Originally developed at the University of California - Berkeley's AMPLab
Apache is a fast and general engine for large-scale data processing aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes Low latency: sub-second
More informationProcessing of big data with Apache Spark
Processing of big data with Apache Spark JavaSkop 18 Aleksandar Donevski AGENDA What is Apache Spark? Spark vs Hadoop MapReduce Application Requirements Example Architecture Application Challenges 2 WHAT
More informationDeploy an external load balancer with
Deploying an Internally and Externally Load Balanced App with Marathon-LB In this tutorial, Marathon-LB is used as an internal and external load balancer. The external load balancer is used to route external
More informationNote: Isolation guarantees among subnets depend on your firewall policies.
Virtual Networks DC/OS supports Container Networking Interface (CNI)-compatible virtual networking solutions, including Calico and Contrail. DC/OS also provides a native virtual networking solution called
More informationCloud & container monitoring , Lars Michelsen Check_MK Conference #4
Cloud & container monitoring 04.05.2018, Lars Michelsen Some cloud definitions Applications Data Runtime Middleware O/S Virtualization Servers Storage Networking Software-as-a-Service (SaaS) Applications
More informationBatch Processing Basic architecture
Batch Processing Basic architecture in big data systems COS 518: Distributed Systems Lecture 10 Andrew Or, Mike Freedman 2 1 2 64GB RAM 32 cores 64GB RAM 32 cores 64GB RAM 32 cores 64GB RAM 32 cores 3
More informationAccelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite. Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017
Accelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017 About the Presentation Problems Existing Solutions Denis Magda
More informationDistributed Data on Distributed Infrastructure. Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere
Distributed Data on Distributed Infrastructure Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere Kunal Kusoorkar Director Solutions Engineering, ArangoDB @neunhoef Jörg Schad Claudius
More informationStorm. Distributed and fault-tolerant realtime computation. Nathan Marz Twitter
Storm Distributed and fault-tolerant realtime computation Nathan Marz Twitter Storm at Twitter Twitter Web Analytics Before Storm Queues Workers Example (simplified) Example Workers schemify tweets and
More informationHow Container Schedulers and Software-based Storage will Change the Cloud
How Container Schedulers and Software-based Storage will Change the Cloud David vonthenen {code} by Dell EMC @dvonthenen http://dvonthenen.com github.com/dvonthenen Agenda Review of Software-based Storage
More informationAuto Management for Apache Kafka and Distributed Stateful System in General
Auto Management for Apache Kafka and Distributed Stateful System in General Jiangjie (Becket) Qin Data Infrastructure @LinkedIn GIAC 2017, 12/23/17@Shanghai Agenda Kafka introduction and terminologies
More informationThe Stream Processor as a Database. Ufuk
The Stream Processor as a Database Ufuk Celebi @iamuce Realtime Counts and Aggregates The (Classic) Use Case 2 (Real-)Time Series Statistics Stream of Events Real-time Statistics 3 The Architecture collect
More informationModern Stream Processing with Apache Flink
1 Modern Stream Processing with Apache Flink Till Rohrmann GOTO Berlin 2017 2 Original creators of Apache Flink da Platform 2 Open Source Apache Flink + da Application Manager 3 What changes faster? Data
More informationPrototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.org Pete Skomoroch Research Scientist at LinkedIn Consultant at Data Wrangling @peteskomoroch 09/29/09 1 Talk Outline TrendingTopics Overview Wikipedia Page
More informationCS-580K/480K Advanced Topics in Cloud Computing. Container III
CS-580/480 Advanced Topics in Cloud Computing Container III 1 Docker Container https://www.docker.com/ Docker is a platform for developers and sysadmins to develop, deploy, and run applications with containers.
More information/ Cloud Computing. Recitation 5 September 26 th, 2017
15-319 / 15-619 Cloud Computing Recitation 5 September 26 th, 2017 1 Overview Administrative issues Office Hours, Piazza guidelines Last week s reflection Project 2.1, OLI Unit 2 modules 5 and 6 This week
More informationCSE 444: Database Internals. Lecture 23 Spark
CSE 444: Database Internals Lecture 23 Spark References Spark is an open source system from Berkeley Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing. Matei
More informationADVANCED DATABASES CIS 6930 Dr. Markus Schneider. Group 5 Ajantha Ramineni, Sahil Tiwari, Rishabh Jain, Shivang Gupta
ADVANCED DATABASES CIS 6930 Dr. Markus Schneider Group 5 Ajantha Ramineni, Sahil Tiwari, Rishabh Jain, Shivang Gupta WHAT IS ELASTIC SEARCH? Elastic Search Elasticsearch is a search engine based on Lucene.
More informationBig Data Infrastructures & Technologies
Big Data Infrastructures & Technologies Spark and MLLIB OVERVIEW OF SPARK What is Spark? Fast and expressive cluster computing system interoperable with Apache Hadoop Improves efficiency through: In-memory
More information[This is not an article, chapter, of conference paper!]
http://www.diva-portal.org [This is not an article, chapter, of conference paper!] Performance Comparison between Scaling of Virtual Machines and Containers using Cassandra NoSQL Database Sogand Shirinbab,
More informationMicrosoft Cloud Workshop. Containers and DevOps Hackathon Learner Guide
Microsoft Cloud Workshop Containers and DevOps Hackathon Learner Guide September 2017 2017 Microsoft Corporation. All rights reserved. This document is confidential and proprietary to Microsoft. Internal
More informationHAProxy configuration
Marathon LB Reference HAProxy configuration Marathon-LB works by automatically generating configuration for HAProxy and then reloading HAProxy as needed. Marathon-LB generates the HAProxy configuration
More informationPractical Considerations for Multi- Level Schedulers. Benjamin
Practical Considerations for Multi- Level Schedulers Benjamin Hindman @benh agenda 1 multi- level scheduling (scheduler activations) 2 intra- process multi- level scheduling (Lithe) 3 distributed multi-
More informationDeploy Like A Boss Oliver Nicholas
Deploy Like A Boss Oliver Nicholas DEPLOY LIKE A BOSS THE JOURNEY FROM 2 SERVERS TO 20,000 THE DEPLOYMENT PIPELINE MARCH 1, 2015 3 UBER TECHNOLOGIES, INC BUSINESS METRICS 311 Cities 57 Countries 1,000,000+
More informationECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective
ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective Part II: Data Center Software Architecture: Topic 3: Programming Models Piccolo: Building Fast, Distributed Programs
More informationHybrid MapReduce Workflow. Yang Ruan, Zhenhua Guo, Yuduo Zhou, Judy Qiu, Geoffrey Fox Indiana University, US
Hybrid MapReduce Workflow Yang Ruan, Zhenhua Guo, Yuduo Zhou, Judy Qiu, Geoffrey Fox Indiana University, US Outline Introduction and Background MapReduce Iterative MapReduce Distributed Workflow Management
More informationFunctional Comparison and Performance Evaluation. Huafeng Wang Tianlun Zhang Wei Mao 2016/11/14
Functional Comparison and Performance Evaluation Huafeng Wang Tianlun Zhang Wei Mao 2016/11/14 Overview Streaming Core MISC Performance Benchmark Choose your weapon! 2 Continuous Streaming Micro-Batch
More informationRESILIENT DISTRIBUTED DATASETS: A FAULT-TOLERANT ABSTRACTION FOR IN-MEMORY CLUSTER COMPUTING
RESILIENT DISTRIBUTED DATASETS: A FAULT-TOLERANT ABSTRACTION FOR IN-MEMORY CLUSTER COMPUTING Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin,
More informationDocker & Mesos/Marathon in production at OVH. Balthazar Rouberol https://ovh.to/6brrkan
Docker & Mesos/Marathon in production at OVH Balthazar Rouberol https://ovh.to/6brrkan 1 About Docker at OVH 2014-2015: Home-made container orchestrator, Sailabove, based on LXC 2016: Switch to Docker
More informationContainers and the Evolution of Computing
Containers and the Evolution of Computing Matt Nowina Solutions Architect 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Scaling Applications Order UI User UI Shipping UI Order
More informationApache Spark 2.0. Matei
Apache Spark 2.0 Matei Zaharia @matei_zaharia What is Apache Spark? Open source data processing engine for clusters Generalizes MapReduce model Rich set of APIs and libraries In Scala, Java, Python and
More informationUnderstanding and Evaluating Kubernetes. Haseeb Tariq Anubhavnidhi Archie Abhashkumar
Understanding and Evaluating Kubernetes Haseeb Tariq Anubhavnidhi Archie Abhashkumar Agenda Overview of project Kubernetes background and overview Experiments Summary and Conclusion 1. Overview of Project
More informationPutting A Spark in Web Apps
Putting A Spark in Web Apps Apache Big Data, Seville ES, 2016 David Fallside Intro Web app development has moved from Java etc to Node.js and JavaScript JS environment relatively simple and very rich,
More informationApache Ignite and Apache Spark Where Fast Data Meets the IoT
Apache Ignite and Apache Spark Where Fast Data Meets the IoT Denis Magda GridGain Product Manager Apache Ignite PMC http://ignite.apache.org #apacheignite #denismagda Agenda IoT Demands to Software IoT
More informationStreaming data Model is opposite Queries are usually fixed and data are flows through the system.
1 2 3 Main difference is: Static Data Model (For related database or Hadoop) Data is stored, and we just send some query. Streaming data Model is opposite Queries are usually fixed and data are flows through
More informationLambda Architecture with Apache Spark
Lambda Architecture with Apache Spark Michael Hausenblas, Chief Data Engineer MapR First Galway Data Meetup, 2015-02-03 2015 MapR Technologies 2015 MapR Technologies 1 Polyglot Processing 2015 2014 MapR
More informationContainer Pods with Docker Compose in Apache Mesos
Container Pods with Docker Compose in Apache Mesos 1 Summary Goals: 1. Treating Apache Mesos and docker as first class citizens, the platform needs to seamlessly run and scale docker container pods in
More informationAn Introduction to Big Data Analysis using Spark
An Introduction to Big Data Analysis using Spark Mohamad Jaber American University of Beirut - Faculty of Arts & Sciences - Department of Computer Science May 17, 2017 Mohamad Jaber (AUB) Spark May 17,
More informationReal-time personal trainer on the SMACK Jan Anirvan Chakraborty
Real-time personal trainer on the SMACK stack @honzam399 Jan Machacek @anirvan_c Anirvan Chakraborty Automated personal trainer - muvr Suggests the sequence of exercise sessions Suggests exercises in a
More informationContainer Orchestration on Amazon Web Services. Arun
Container Orchestration on Amazon Web Services Arun Gupta, @arungupta Docker Workflow Development using Docker Docker Community Edition Docker for Mac/Windows/Linux Monthly edge and quarterly stable
More informationFROM MONOLITH TO DOCKER DISTRIBUTED APPLICATIONS
FROM MONOLITH TO DOCKER DISTRIBUTED APPLICATIONS Carlos Sanchez @csanchez Watch online at carlossg.github.io/presentations ABOUT ME Senior So ware Engineer @ CloudBees Author of Jenkins Kubernetes plugin
More information