Scaling Up from 1000 to 10 Nodes


1 Scaling Up from 1000 to 10 Nodes. Anton Lavrik, Alert Logic, Inc.

2 Alert Logic vs Twitter: daily numbers (2012)

                      Alert Logic   Twitter
    Events received   11.5B         0.5B
    Volume received   5TB           70GB
    Logs produced     ?             100TB

3 Alert Logic vs Twitter: daily numbers (2012)

                      Alert Logic   Twitter
    Events received   11.5B         0.5B
    Volume received   5TB           70GB
    Logs produced     10GB          100TB

4 Log collection: before rewrite
  [Diagram: hosts (H) x 100K; appliances (A) x 1000; WAN; Alert Logic backend; users]

5 Log collection: goal
  [Diagram: hosts & appliances + agents (H, H', A') x 1M+; Alert Logic backend; users]

6 Log collection: goal
  [Diagram: as on slide 5, with the backend labeled "new backend"]

7 Log collection: zooming in
  [Diagram: agents (H', A'); Collection, Storage, Analysis; Web UI: Search, Reports, Alerts]

8 Log collection: zooming in
  [Diagram: clients (C): appliance (A') and agent (H'); Load Balancing; three Collection Controllers; Persistent Queue]

9 Log collection: zooming in
  [Diagram: clients (C) x 100K+; LB; three Collection Controllers; label "3"; Persistent Queue]

10, 11 Log collection: zooming in
  [Diagram: as on slide 9, with the added labels "HTTPS" and "sticky sessions" at the LB]

12 Log collection: before rewrite
  [Diagram: repeat of slide 4]

13 Log collection: zooming in
  [Diagram: clients (C) x 40K; collection controller node; controller Erlang processes (CC) x 40K; Persistent Queue]

14 Log collection: zooming in
  [Diagram: as on slide 13, plus Stats, Configs, DBs, ...]

15 What changes when you scale up from 1000 to 10 nodes?

16 Everything!

17 Increases per node/CPU core
  - request rate
  - data volume
  - number of open connections
  - number of processes per VM
  - failure rate
  - bugs and memory leaks get exposed

18 Colocated environment
  - different clients installed in different environments create different load patterns
  - one set of clients can affect the health of others
  - can't easily upgrade/troubleshoot individual clients
  - no downtime tolerance

19 Application monitoring
  - logging is less useful for troubleshooting
  - need real-time monitoring, dashboards
  - need visualization

20 Problem: high memory usage leading to OOM kill

21 Hint: many processes, each dealing with lots of data
  [Diagram: clients (C) x 40K; collection controller node; controller Erlang processes (CC) x 40K; Persistent Queue]

22 High memory usage: force garbage collection

    garbage_collect(P)

23 Problem: memory leak

24 Memory leak: searching... some luck

    [ garbage_collect(P) || P <- processes() ].
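One way to tell whether a forced collection frees anything is to compare the VM's binary memory before and after the sweep. The following is a minimal sketch, not code from the talk: the module name gc_probe is hypothetical, while erlang:memory/1 and erlang:garbage_collect/1 are standard BIFs.

```erlang
-module(gc_probe).
-export([run/0]).

%% Force a garbage collection in every process and report the
%% VM-wide binary memory (in bytes) before and after the sweep.
run() ->
    Before = erlang:memory(binary),
    %% garbage_collect/1 returns false for processes that have
    %% already exited, so the comprehension never crashes.
    _ = [erlang:garbage_collect(P) || P <- processes()],
    After = erlang:memory(binary),
    {Before, After}.
```

A large drop between the two numbers suggests that processes were holding references to binaries they no longer needed, which matches the "some luck" result on the slide.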

25 Memory leak: troubleshooting...

    [{M, P, process_info(P, [registered_name, initial_call,
                             current_function, dictionary]), B} ||
        {P, M, B} <- lists:sublist(lists:reverse(lists:keysort(2,
            [case process_info(P, binary) of
                 {_, Bins} ->
                     SortedBins = lists:usort(Bins),
                     {_, Sizes, _} = lists:unzip3(SortedBins),
                     {P, lists:sum(Sizes), SortedBins};
                 _ ->
                     {P, 0, []}
             end || P <- processes()])), 5)].

26 Memory leak: troubleshooting...

    [{M, P, process_info(P, [registered_name, initial_call,
                             current_function, dictionary]), B} ||
        {P, M, B} <- lists:sublist(lists:reverse(lists:keysort(2,
            [case process_info(P, binary) of
                 {_, Bins} ->
                     SortedBins = lists:usort(Bins),
                     {_, Sizes, _} = lists:unzip3(SortedBins),
                     {P, lists:sum(Sizes), []};
                 _ ->
                     {P, 0, []}
             end || P <- processes()])), 5)].
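The same ranking can be packaged as a small module, which is easier to reuse from a remote shell than a pasted one-liner. This is a sketch under assumptions: the module and function names (binmem, top/1) are hypothetical, and the {Id, Size, RefCount} shape returned by process_info(P, binary) is the one the slide's code relies on, which the OTP documentation describes as internal and subject to change.

```erlang
-module(binmem).
-export([top/1]).

%% Return up to N {Pid, Bytes} pairs, largest first, where Bytes is
%% the total size of the off-heap binaries referenced by the process.
top(N) ->
    Sized = [{P, binary_bytes(P)} || P <- processes()],
    lists:sublist(lists:reverse(lists:keysort(2, Sized)), N).

%% Sum the sizes of the distinct binaries a process refers to.
%% usort/1 removes repeated references to the same binary.
binary_bytes(P) ->
    case process_info(P, binary) of
        {binary, Bins} ->
            lists:sum([Size || {_Id, Size, _RefCount} <- lists:usort(Bins)]);
        undefined ->
            %% The process exited between processes() and this call.
            0
    end.
```

Calling binmem:top(5) gives the five processes that reference the most binary data; process_info/2 can then be used on those pids as on the slide.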

27 Memory leak: searching...

    [{ ,< >,
      [{registered_name,alcollect_client_sup},
       {initial_call,{proc_lib,init_p,5}},
       {current_function,{gen_server,loop,6}},
       {dictionary,[{'$ancestors',[alcollect_sup,lmcollect_sup, < >]},
                    {'$initial_call',
                     {supervisor,alcollect_client_sup,1}}]}],
      []},
     { ,< >,
      [{registered_name,gproc},
       {initial_call,{proc_lib,init_p,5}},
       {current_function,{gen_server,loop,6}},
       {dictionary,[{'$ancestors',[gproc_sup,< >]},
                    {'$initial_call',{gproc,init,1}}]}],
      []},

28 Memory leak: searching... victory!

    supervisor:terminate_child(lmcollect_sup, lmcollect_ls_config).
    application:stop(lmcollect)

29 Memory leak: sub-binary references
  - process heaps
  - ets tables
  - message queues
  - driver queues

30 Memory leak: preventing binary references
  1. binary:copy(X)
  2. -define(copy_binaries(X), binary_to_term(term_to_binary(X))).
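The effect of binary:copy/1 can be observed with binary:referenced_byte_size/1, which reports the size of the underlying binary a term keeps alive. The following is a minimal sketch: the module name subbin_demo and the 1 MB and 16-byte sizes are illustrative assumptions, and both binary functions are part of the standard library.

```erlang
-module(subbin_demo).
-export([run/0]).

%% Build a 1 MB binary, take a 16-byte sub-binary of it, and compare
%% how much memory the sub-binary and a fresh copy of it keep alive.
run() ->
    Big = binary:copy(<<0>>, 1024 * 1024),
    %% Sub is only 16 bytes long, but it references all of Big.
    <<Sub:16/binary, _/binary>> = Big,
    %% Copy owns its own 16 bytes and no longer references Big.
    Copy = binary:copy(Sub),
    {binary:referenced_byte_size(Sub),
     binary:referenced_byte_size(Copy)}.
```

The first number covers the whole large binary and the second only the 16 copied bytes. Storing Sub in a process heap, an ets table, or a message queue would therefore pin the full 1 MB, while storing Copy would not.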

31 Problem: supervisors shut down after children exceed max restart rate

32 Preventing supervisor shutdowns
  Delay child crashes: 2 options
  1. supervisor2
  2. manually insert delays for each gen_server's Mod:handle_X

    handle_call() ->
        try do_handle_call()
        catch Class:Reason ->
            timer:sleep(1000),
            exit({crash, Class, Reason, erlang:get_stacktrace()})
        end
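Option 2 can be factored into a reusable wrapper. This is a sketch, not the talk's code: the module name crash_delay and the function guard/2 are hypothetical, and it uses the Class:Reason:Stacktrace catch syntax available from OTP 21 in place of erlang:get_stacktrace/0.

```erlang
-module(crash_delay).
-export([guard/2]).

%% Run Fun; if it crashes, sleep DelayMs milliseconds before exiting,
%% so that repeated crashes reach the supervisor spaced out in time
%% and are less likely to exceed its maximum restart intensity.
guard(Fun, DelayMs) when is_function(Fun, 0), is_integer(DelayMs) ->
    try
        Fun()
    catch
        Class:Reason:Stacktrace ->
            timer:sleep(DelayMs),
            exit({crash, Class, Reason, Stacktrace})
    end.
```

A gen_server callback could then be written as handle_call(Req, From, State) -> crash_delay:guard(fun() -> do_handle_call(Req, From, State) end, 1000), where do_handle_call/3 stands for the real handler. The trade-off is that the process stays blocked, with its state and mailbox, for the duration of the delay.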

33 Problem: stall while using gen_udp


35 usr/erlang/erts-5.8.5/src/prim_inet.erl

    send(S, Data, OptList) when is_port(S), is_list(OptList) ->
        ?DBG_FORMAT("prim_inet:send(~p, ~p)~n", [S,Data]),
        try erlang:port_command(S, Data, OptList) of
            false -> % Port busy and nosuspend option passed
                ?DBG_FORMAT("prim_inet:send() -> {error,busy}~n", []),
                {error,busy};
            true ->
                receive
                    {inet_reply,S,Status} ->
                        ?DBG_FORMAT("prim_inet:send() -> ~p~n", [Status]),
                        Status
                end
        catch
            error:_Error ->
                ?DBG_FORMAT("prim_inet:send() -> {error,einval}~n", []),
                {error,einval}
        end.

36 Preventing gen_server stall when using gen_udp
  1. gen_server2
  2. asynchronous gen_udp:send

    -    gen_udp:send(Socket, Address, Port, SyslogMessage);
    +    try erlang:port_command(Socket,
    +            [[((Port) bsr 8) band 16#ff,
    +              (Port) band 16#ff],
    +             [A band 16#ff, B band 16#ff,
    +              C band 16#ff, D band 16#ff],
    +             SyslogMessage]) of
    +        true -> ok
    +    catch...

    +handle_info({inet_reply, _, ok}, State) ->
    +    {noreply, State};

  3. fix gen_udp!
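The byte arithmetic in the diff builds the destination header that precedes the payload: the port as two big-endian bytes, followed by the four IPv4 address bytes. The sketch below isolates only that encoding. The module name udp_hdr and the function encode/2 are hypothetical, and the layout is taken from the slide, so it should be read as specific to the inet driver of that era (erts-5.8.5) and not as a stable interface.

```erlang
-module(udp_hdr).
-export([encode/2]).

%% Encode a UDP destination as an iolist: the port as two bytes,
%% most significant first, then the four IPv4 address bytes.
encode(Port, {A, B, C, D}) ->
    [[(Port bsr 8) band 16#ff, Port band 16#ff],
     [A band 16#ff, B band 16#ff, C band 16#ff, D band 16#ff]].
```

For example, iolist_to_binary(udp_hdr:encode(514, {10,0,0,1})) yields <<2,2,10,0,0,1>>, since 514 is 16#0202. Sending through port_command directly returns at once, and the driver's {inet_reply, _, ok} message is consumed later in handle_info, as on the slide.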

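37 Problem: client-side latencies: 50%-ile
  [Chart not preserved in the transcript]

38 Client-side latencies: 90%-ile
  [Chart not preserved in the transcript]

39 Client-side latencies: 99%-ile
  [Chart not preserved in the transcript]

40 Client-side latencies: max
  [Chart not preserved in the transcript]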

41 Summary

42 Erlang is a very efficient tool*

43 Erlang is a very efficient tool*
  * you need to know your tools

44
  - Excellent runtime for running highly concurrent applications + soft real-time
  - Simple and powerful language
  - Fault-tolerance
  - Stable
  - Rich troubleshooting capabilities
  - Growing community

45 A dream: Erlang/OSP (Open Server Platform)

46
  - revise OTP
  - establish OSP principles and building blocks
  - packaging and deployment: e.g. get rid of reltool.config
  - logging out of the box: standard error, syslog (including sasl)
  - http client that works well
  - clients for popular databases: MySQL, Postgres
  - alternative distributed Erlang: cluster membership, leader election, distributed locking

47 Questions? Thank you

48 Optional: if there's enough time and interest

49 Problem: httpc - Erlang/OTP HTTP client

50 httpc bug: mixes up responses

  Server responds with 200:

    Oct 23 02:20:36 lm-scale-api-01 al-hostmeta[17814]: ALH00002I "GET /host_metadata/1710/ e1924d5bfc58ac9576d6b31f261e0d137d56e142 HTTP/1.1" performed in 4 ms

  Client magically receives 204:

    Oct 23 02:20:36 lm-scale-lmcollect-02 al-lmcollect[9436]: ALL00000T al_httpc:61: http get e1924d5bfc58ac9576d6b31f261e0d137d56e142 performed in ms
    Oct 23 02:20:36 lm-scale-lmcollect-02 al-lmcollect[9436]: AHM00001E Error querying host metadata for key 1710, e1924d5bfc58ac9576d6b31f261e0d137d56e142: {unknown_http_response, {{"HTTP/1.1",204,"No Content"},[{"date","Tue, 23 Oct :20:36 GMT"},{"server","MochiWeb/1.0 (Any of you quaids got a smint?)"}, {"content-length","0"}],<<>>}}

51 httpc bug: mixes up responses

  Server responds with 204:

    Oct 23 02:20:38 lm-scale-api-01 al-hostmeta[17814]: ALH00002I "PUT /host_metadata/1712/ fd09bc458d5ee0a250668cf9bf41b0bea HTTP/1.1" performed in 4 ms

  Client receives 404:

    Oct 23 02:20:38 lm-scale-lmcollect-02 al-lmcollect[9436]: ALL00000T al_httpc:86: http put fd09bc458d5ee0a250668cf9bf41b0bea performed in ms
    Oct 23 02:20:38 lm-scale-lmcollect-02 al-lmcollect[9436]: AHM00000E Error updating host metadata for key 1712, fd09bc458d5ee0a250668cf9bf41b0bea : {unknown_http_response, {404,<<"No metadata for key">>}}

Performance Optimization 101. Louis-Philippe Gauthier Team AdGear Trader

Performance Optimization 101. Louis-Philippe Gauthier Team AdGear Trader Performance Optimization 101 Louis-Philippe Gauthier Team leader @ AdGear Trader Exercise HTTP API server API GET /date - returns today s date GET /time - returns the unix time in seconds HTTP API server

More information

BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR. Petri Kero CTO / Ministry of Games

BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR. Petri Kero CTO / Ministry of Games BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR Petri Kero CTO / Ministry of Games MOBILE GAME BACKEND CHALLENGES Lots of concurrent users Complex interactions between players Persistent world with frequent

More information

Building Scalable Stateful Services. Craft Conf 2016

Building Scalable Stateful Services. Craft Conf 2016 Building Scalable Stateful s Craft Conf 2016 Caitie McCaffrey Distributed Systems Engineer @caitie caitiem.com Stateless s Stateless s Stateless s Stateless s Stateless s Stateless s Stateless s Stateless

More information

HA solution with PXC-5.7 with ProxySQL. Ramesh Sivaraman Krunal Bauskar

HA solution with PXC-5.7 with ProxySQL. Ramesh Sivaraman Krunal Bauskar HA solution with PXC-5.7 with ProxySQL Ramesh Sivaraman Krunal Bauskar Agenda What is Good HA eco-system? Understanding PXC-5.7 Understanding ProxySQL PXC + ProxySQL = Complete HA solution Monitoring using

More information

Introduction Storage Processing Monitoring Review. Scaling at Showyou. Operations. September 26, 2011

Introduction Storage Processing Monitoring Review. Scaling at Showyou. Operations. September 26, 2011 Scaling at Showyou Operations September 26, 2011 I m Kyle Kingsbury Handle aphyr Code http://github.com/aphyr Email kyle@remixation.com Focus Backend, API, ops What the hell is Showyou? Nontrivial complexity

More information

Percona XtraDB Cluster ProxySQL. For your high availability and clustering needs

Percona XtraDB Cluster ProxySQL. For your high availability and clustering needs Percona XtraDB Cluster-5.7 + ProxySQL For your high availability and clustering needs Ramesh Sivaraman Krunal Bauskar Agenda What is Good HA eco-system? Understanding PXC-5.7 Understanding ProxySQL PXC

More information

FRANCESCO CESARINI. presents ERLANG/OTP. Francesco Cesarini Erlang

FRANCESCO CESARINI. presents ERLANG/OTP. Francesco Cesarini Erlang FRANCESCO CESARINI presents Francesco Cesarini Erlang Solutions ERLANG/OTP @FrancescoC francesco@erlang-solutions.com www.erlang-solutions.com WHAT IS SCALABILITY? WHAT IS (MASSIVE) CONCURRENCY? WHAT

More information

ZooKeeper. Wait-free coordination for Internet-scale systems

ZooKeeper. Wait-free coordination for Internet-scale systems ZooKeeper Wait-free coordination for Internet-scale systems Patrick Hunt and Mahadev (Yahoo! Grid) Flavio Junqueira and Benjamin Reed (Yahoo! Research) Internet-scale Challenges Lots of servers, users,

More information

MySQL As A Service. Operationalizing 19 Years of Infrastructure at GoDaddy

MySQL As A Service. Operationalizing 19 Years of Infrastructure at GoDaddy MySQL As A Service Operationalizing 19 Years of Infrastructure at GoDaddy WHOAMI Nathan Northcutt Senior Software Engineer MySQL DevOps ~10 years performance engineering & distributed data services. Email:

More information

CS October 2017

CS October 2017 Atomic Transactions Transaction An operation composed of a number of discrete steps. Distributed Systems 11. Distributed Commit Protocols All the steps must be completed for the transaction to be committed.

More information

Aurora, RDS, or On-Prem, Which is right for you

Aurora, RDS, or On-Prem, Which is right for you Aurora, RDS, or On-Prem, Which is right for you Kathy Gibbs Database Specialist TAM Katgibbs@amazon.com Santa Clara, California April 23th 25th, 2018 Agenda RDS Aurora EC2 On-Premise Wrap-up/Recommendation

More information

Achieving Scalability and High Availability for clustered Web Services using Apache Synapse. Ruwan Linton WSO2 Inc.

Achieving Scalability and High Availability for clustered Web Services using Apache Synapse. Ruwan Linton WSO2 Inc. Achieving Scalability and High Availability for clustered Web Services using Apache Synapse Ruwan Linton [ruwan@apache.org] WSO2 Inc. Contents Introduction Apache Synapse Web services clustering Scalability/Availability

More information

TECHED USER CONFERENCE MAY 3-4, 2016

TECHED USER CONFERENCE MAY 3-4, 2016 TECHED USER CONFERENCE MAY 3-4, 2016 Bob Jeffcott Software AG Big Data Adabas In Memory Data Management with Terracotta 2016 Software AG. All rights reserved. For internal use only AGENDA 1. ADABAS/NATURAL

More information

C. Collect engine heap performance data via the Cisco Unified Real-Time Monitoring Tool.

C. Collect engine heap performance data via the Cisco Unified Real-Time Monitoring Tool. Volume: 50 Questions Question No: 1 In a high availability over WAN deployment, which option cannot be located across the WAN from the active Cisco Unified Contact Center Express site? A. SMTP server B.

More information

How can you manage what you can t see?

How can you manage what you can t see? How can you manage what you can t see? Know what you have with Panda Cloud Systems Management Business challenge: You can t manage it if you don t know it exists. Do you have 100% permanent visibility

More information

Building a Scalable Architecture for Web Apps - Part I (Lessons Directi)

Building a Scalable Architecture for Web Apps - Part I (Lessons Directi) Intelligent People. Uncommon Ideas. Building a Scalable Architecture for Web Apps - Part I (Lessons Learned @ Directi) By Bhavin Turakhia CEO, Directi (http://www.directi.com http://wiki.directi.com http://careers.directi.com)

More information

Hi! NET Developer Group Braunschweig!

Hi! NET Developer Group Braunschweig! Hi! NET Developer Group Braunschweig! Über Tobias Dipl. Informatiker (FH) Passionated Software Developer Clean Code Developer.NET Junkie.NET User Group Lead Microsoft PFE Software Development Twitter @Blubern

More information

Load Balancing For Clustered Barracuda CloudGen WAF Instances in the New Microsoft Azure Management Portal

Load Balancing For Clustered Barracuda CloudGen WAF Instances in the New Microsoft Azure Management Portal Load Balancing For Clustered Barracuda CloudGen WAF Instances in the New Microsoft Azure Management This guide will walk you through the steps to load balance traffic across multiple instances of the Barracuda

More information

Oracle WebCenter Portal Performance Tuning

Oracle WebCenter Portal Performance Tuning ORACLE PRODUCT LOGO Oracle WebCenter Portal Performance Tuning Rich Nessel - Principal Product Manager Christina Kolotouros - Product Management Director 1 Copyright 2011, Oracle and/or its affiliates.

More information

Application Management Webinar. Daniela Field

Application Management Webinar. Daniela Field Application Management Webinar Daniela Field Agenda } Agile Deployment } Project vs Node Security } Deployment } Cloud Administration } Monitoring } Logging } Alerting Cloud Overview Cloud Overview Project

More information

A Distributed System Case Study: Apache Kafka. High throughput messaging for diverse consumers

A Distributed System Case Study: Apache Kafka. High throughput messaging for diverse consumers A Distributed System Case Study: Apache Kafka High throughput messaging for diverse consumers As always, this is not a tutorial Some of the concepts may no longer be part of the current system or implemented

More information

Table of Contents HOL-SDC-1317

Table of Contents HOL-SDC-1317 Table of Contents Lab Overview - Components... 2 Business Critical Applications - About this Lab... 3 Infrastructure Components - VMware vcenter... 5 Infrastructure Components - VMware ESXi hosts... 6

More information

Google Cloud Platform for Systems Operations Professionals (CPO200) Course Agenda

Google Cloud Platform for Systems Operations Professionals (CPO200) Course Agenda Google Cloud Platform for Systems Operations Professionals (CPO200) Course Agenda Module 1: Google Cloud Platform Projects Identify project resources and quotas Explain the purpose of Google Cloud Resource

More information

Erlang. Types, Abstract Form & Core. Salvador Tamarit Muñoz. Universitat Politècnica de València

Erlang. Types, Abstract Form & Core. Salvador Tamarit Muñoz. Universitat Politècnica de València Erlang Types, Abstract Form & Core Salvador Tamarit Muñoz Universitat Politècnica de València Contents 1 Introduction Motivation 2 Concurrent Erlang 3 Further reading Introduction Introduction Erlang is

More information

This tutorial will give you a quick start with Consul and make you comfortable with its various components.

This tutorial will give you a quick start with Consul and make you comfortable with its various components. About the Tutorial Consul is an important service discovery tool in the world of Devops. This tutorial covers in-depth working knowledge of Consul, its setup and deployment. This tutorial aims to help

More information

Professional Architect

Professional Architect Professional Architect Core Competencies: Overview * At Dell Boomi, we want to equip our customers for mastery of the AtomSphere platform and their runtime environments. Our certified Professional Architects

More information

Atrium Webinar- What's new in ADDM Version 10

Atrium Webinar- What's new in ADDM Version 10 Atrium Webinar- What's new in ADDM Version 10 This document provides question and answers discussed during following webinar session: Atrium Webinar- What's new in ADDM Version 10 on May 8th, 2014 Q: Hi,

More information

RADU POPESCU IMPROVING THE WRITE SCALABILITY OF THE CERNVM FILE SYSTEM WITH ERLANG/OTP

RADU POPESCU IMPROVING THE WRITE SCALABILITY OF THE CERNVM FILE SYSTEM WITH ERLANG/OTP RADU POPESCU IMPROVING THE WRITE SCALABILITY OF THE CERNVM FILE SYSTEM WITH ERLANG/OTP THE EUROPEAN ORGANISATION FOR PARTICLE PHYSICS RESEARCH (CERN) 2 THE LARGE HADRON COLLIDER THE LARGE HADRON COLLIDER

More information

CS November 2017

CS November 2017 Bigtable Highly available distributed storage Distributed Systems 18. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account

More information

MarkLogic Server. Monitoring MarkLogic Guide. MarkLogic 9 May, Copyright 2017 MarkLogic Corporation. All rights reserved.

MarkLogic Server. Monitoring MarkLogic Guide. MarkLogic 9 May, Copyright 2017 MarkLogic Corporation. All rights reserved. Monitoring MarkLogic Guide 1 MarkLogic 9 May, 2017 Last Revised: 9.0-2, July, 2017 Copyright 2017 MarkLogic Corporation. All rights reserved. Table of Contents Table of Contents Monitoring MarkLogic Guide

More information

CloudI Integration Framework. Chicago Erlang User Group May 27, 2015

CloudI Integration Framework. Chicago Erlang User Group May 27, 2015 CloudI Integration Framework Chicago Erlang User Group May 27, 2015 Speaker Bio Bruce Kissinger is an Architect with Impact Software LLC. Linkedin: https://www.linkedin.com/pub/bruce-kissinger/1/6b1/38

More information

Virtualization And High Availability. Howard Chow Microsoft MVP

Virtualization And High Availability. Howard Chow Microsoft MVP Virtualization And High Availability Howard Chow Microsoft MVP Session Objectives And Agenda Virtualization and High Availability Types of high availability enabled by virtualization Enabling a highly

More information

PrepAwayExam. High-efficient Exam Materials are the best high pass-rate Exam Dumps

PrepAwayExam.   High-efficient Exam Materials are the best high pass-rate Exam Dumps PrepAwayExam http://www.prepawayexam.com/ High-efficient Exam Materials are the best high pass-rate Exam Dumps Exam : SAA-C01 Title : AWS Certified Solutions Architect - Associate (Released February 2018)

More information

Design and Architecture. Derek Collison

Design and Architecture. Derek Collison Design and Architecture Derek Collison What is Cloud Foundry? 2 The Open Platform as a Service 3 4 What is PaaS? Or more specifically, apaas? 5 apaas Application Platform as a Service Applications and

More information

Windows Azure Services - At Different Levels

Windows Azure Services - At Different Levels Windows Azure Windows Azure Services - At Different Levels SaaS eg : MS Office 365 Paas eg : Azure SQL Database, Azure websites, Azure Content Delivery Network (CDN), Azure BizTalk Services, and Azure

More information

Marathon has a timer metric that determines how long an event has taken place. Timer does not exist for Mesos observability metrics.

Marathon has a timer metric that determines how long an event has taken place. Timer does not exist for Mesos observability metrics. Performance Monitoring Here are some recommendations for monitoring a DC/OS cluster. You can use any monitoring tools. The endpoints listed below will help you troubleshoot when issues occur. Your monitoring

More information

Erlang 101. Google Doc

Erlang 101. Google Doc Erlang 101 Google Doc Erlang? with buzzwords Erlang is a functional concurrency-oriented language with extremely low-weight userspace "processes", share-nothing messagepassing semantics, built-in distribution,

More information

Open Source Database Performance Optimization and Monitoring with PMM. Fernando Laudares, Vinicius Grippa, Michael Coburn Percona

Open Source Database Performance Optimization and Monitoring with PMM. Fernando Laudares, Vinicius Grippa, Michael Coburn Percona Open Source Database Performance Optimization and Monitoring with PMM Fernando Laudares, Vinicius Grippa, Michael Coburn Percona Fernando Laudares 2 Vinicius Grippa 3 Michael Coburn Product Manager for

More information

Wasser drauf, umrühren, fertig?

Wasser drauf, umrühren, fertig? Wasser drauf, umrühren, fertig? Steffen Miller Principal Sales Consultant Agenda Motivation Was ist ein WebLogic Cluster? Cluster Konzepte Q & A WLS HA Focus Areas Data Failure Human

More information

VMware vrealize Log Insight Getting Started Guide

VMware vrealize Log Insight Getting Started Guide VMware vrealize Log Insight Getting Started Guide vrealize Log Insight 2.5 This document supports the version of each product listed and supports all subsequent versions until the document is replaced

More information

RabbitMQ Overview. Tony Garnock-Jones

RabbitMQ Overview. Tony Garnock-Jones RabbitMQ Overview Tony Garnock-Jones Agenda AMQP in 3 minutes RabbitMQ architecture Availability, Clustering, Federation Durability, Persistence, Memory usage Security Operational Tools

More information

McAfee Network Security Platform 8.3

McAfee Network Security Platform 8.3 8.3.7.44-8.3.7.14 Manager-Virtual IPS Release Notes McAfee Network Security Platform 8.3 Revision A Contents About this release New features Enhancements Resolved issues Installation instructions Known

More information

VMware vrealize Operations for Horizon Administration. 20 SEP 2018 VMware vrealize Operations for Horizon 6.6

VMware vrealize Operations for Horizon Administration. 20 SEP 2018 VMware vrealize Operations for Horizon 6.6 VMware vrealize Operations for Horizon Administration 20 SEP 2018 VMware vrealize Operations for Horizon 6.6 You can find the most up-to-date technical documentation on the VMware website at: https://docs.vmware.com/

More information

Distributed Systems. 29. Distributed Caching Paul Krzyzanowski. Rutgers University. Fall 2014

Distributed Systems. 29. Distributed Caching Paul Krzyzanowski. Rutgers University. Fall 2014 Distributed Systems 29. Distributed Caching Paul Krzyzanowski Rutgers University Fall 2014 December 5, 2014 2013 Paul Krzyzanowski 1 Caching Purpose of a cache Temporary storage to increase data access

More information

VMware vrealize Operations for Horizon Administration. Modified on 3 JUL 2018 VMware vrealize Operations for Horizon 6.4

VMware vrealize Operations for Horizon Administration. Modified on 3 JUL 2018 VMware vrealize Operations for Horizon 6.4 VMware vrealize Operations for Horizon Administration Modified on 3 JUL 2018 VMware vrealize Operations for Horizon 6.4 You can find the most up-to-date technical documentation on the VMware website at:

More information

Architekturen für die Cloud

Architekturen für die Cloud Architekturen für die Cloud Eberhard Wolff Architecture & Technology Manager adesso AG 08.06.11 What is Cloud? National Institute for Standards and Technology (NIST) Definition On-demand self-service >

More information

jetnexus Virtual Load Balancer

jetnexus Virtual Load Balancer jetnexus Virtual Load Balancer Mitigate the Risk of Downtime and Optimise Application Delivery We were looking for a robust yet easy to use solution that would fit in with our virtualisation policy and

More information

Polarion 18.2 Enterprise Setup

Polarion 18.2 Enterprise Setup SIEMENS Polarion 18.2 Enterprise Setup POL005 18.2 Contents Overview........................................................... 1-1 Terminology..........................................................

More information

the road to cloud native applications Fabien Hermenier

the road to cloud native applications Fabien Hermenier the road to cloud native applications Fabien Hermenier 1 cloud ready applications single-tiered monolithic hardware specific cloud native applications leverage cloud services scalable reliable 2 Agenda

More information

McAfee Network Security Platform 8.3

McAfee Network Security Platform 8.3 8.3.7.28-8.3.7.6 Manager-Virtual IPS Release Notes McAfee Network Security Platform 8.3 Revision B Contents About this release New features Enhancements Resolved issues Installation instructions Known

More information

jetnexus Virtual Load Balancer

jetnexus Virtual Load Balancer jetnexus Virtual Load Balancer Mitigate the Risk of Downtime and Optimise Application Delivery We were looking for a robust yet easy to use solution that would fit in with our virtualisation policy and

More information

Intuitive distributed algorithms. with F#

Intuitive distributed algorithms. with F# Intuitive distributed algorithms with F# Natallia Dzenisenka Alena Hall @nata_dzen @lenadroid A tour of a variety of intuitivedistributed algorithms used in practical distributed systems. and how to prototype

More information

Erlang: distributed programming

Erlang: distributed programming Erlang: distributed May 11, 2012 1 / 21 Fault tolerance in Erlang links, exit signals, system process in Erlang OTP Open Telecom Platform 2 / 21 General idea Links Exit signals System processes Summary

More information

GlassFish v2.1 & Enterprise Manager. Alexis Moussine-Pouchkine Sun Microsystems

GlassFish v2.1 & Enterprise Manager. Alexis Moussine-Pouchkine Sun Microsystems GlassFish v2.1 & Enterprise Manager Alexis Moussine-Pouchkine Sun Microsystems 1 Some vocabulary Cluster a group a homogenous GlassFish instances administered as a whole Load-Balancing a strategy and implementation

More information

All you need is fun. Cons T Åhs Keeper of The Code

All you need is fun. Cons T Åhs Keeper of The Code All you need is fun Cons T Åhs Keeper of The Code cons@klarna.com Cons T Åhs Keeper of The Code at klarna Architecture - The Big Picture Development - getting ideas to work Code Quality - care about the

More information

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5.

CS /15/16. Paul Krzyzanowski 1. Question 1. Distributed Systems 2016 Exam 2 Review. Question 3. Question 2. Question 5. Question 1 What makes a message unstable? How does an unstable message become stable? Distributed Systems 2016 Exam 2 Review Paul Krzyzanowski Rutgers University Fall 2016 In virtual sychrony, a message

More information

Real-time Monitoring, Inventory and Change Tracking for. Track. Report. RESOLVE!

Real-time Monitoring, Inventory and Change Tracking for. Track. Report. RESOLVE! Real-time Monitoring, Inventory and Change Tracking for Track. Report. RESOLVE! Powerful Monitoring Tool for Full Visibility over Your Hyper-V Environment VirtualMetric provides the most comprehensive

More information

Google File System. Arun Sundaram Operating Systems

Google File System. Arun Sundaram Operating Systems Arun Sundaram Operating Systems 1 Assumptions GFS built with commodity hardware GFS stores a modest number of large files A few million files, each typically 100MB or larger (Multi-GB files are common)

More information

Loosely coupled: asynchronous processing, decoupling of tiers/components Fan-out the application tiers to support the workload Use cache for data and content Reduce number of requests if possible Batch

More information

CS November 2018

CS November 2018 Bigtable Highly available distributed storage Distributed Systems 19. Bigtable Built with semi-structured data in mind URLs: content, metadata, links, anchors, page rank User data: preferences, account

More information

Architecting for the.

Architecting for the. Architecting for the Cloud @axelfontaine About Axel Fontaine Founder and CEO of Boxfuse Over 15 years industry experience Continuous Delivery expert Regular speaker at tech conferences JavaOne RockStar

More information

AppSense DataNow. Release Notes (Version 4.1) Components in this Release. These release notes include:

AppSense DataNow. Release Notes (Version 4.1) Components in this Release. These release notes include: AppSense DataNow Release Notes (Version 4.1) These release notes include: Components in this Release Important Upgrade Information New Features Bugs Fixed Known Issues and Limitations Supported Operating

More information

Intra-cluster Replication for Apache Kafka. Jun Rao

Intra-cluster Replication for Apache Kafka. Jun Rao Intra-cluster Replication for Apache Kafka Jun Rao About myself Engineer at LinkedIn since 2010 Worked on Apache Kafka and Cassandra Database researcher at IBM Outline Overview of Kafka Kafka architecture

More information

Building Cloud Infrastructure

Building Cloud Infrastructure Building Cloud Infrastructure Aaron Davidson CS 349D Who am I? - Early Databricks engineer (4 years) - Apache Spark committer & PMC member - Worked on a lot of things @ DB - Most recently, cloud infrastructure

More information

Pivotal Greenplum Command Center

Pivotal Greenplum Command Center Pivotal Greenplum Command Center Version 4.5.0 User Guide Rev: 01 2018 Pivotal Software, Inc. Table of Contents Table of Contents Pivotal Greenplum Command Center 4.5.0 Documentation Pivotal Greenplum

More information

Preventing and Resolving MySQL Downtime. Jervin Real, Michael Coburn Percona

Preventing and Resolving MySQL Downtime. Jervin Real, Michael Coburn Percona Preventing and Resolving MySQL Downtime Jervin Real, Michael Coburn Percona About Us Jervin Real, Technical Services Manager Engineer Engineering Engineers APAC Michael Coburn, Principal Technical Account

More information

BMC Configuration Management (Marimba) Best Practices and Troubleshooting. Andy Santosa Senior Technical Support Analyst

BMC Configuration Management (Marimba) Best Practices and Troubleshooting. Andy Santosa Senior Technical Support Analyst BMC Configuration Management (Marimba) Best Practices and Troubleshooting Andy Santosa Senior Technical Support Analyst 9/3/2006 Agenda CM Infrastructure CM Inventory CM Subscription CM Software Distribution

More information
