Building a Scalable Architecture for Web Apps - Part I (Lessons Learned @ Directi)

Intelligent People. Uncommon Ideas.
Building a Scalable Architecture for Web Apps - Part I (Lessons Learned @ Directi)
By Bhavin Turakhia, CEO, Directi (http://www.directi.com http://wiki.directi.com http://careers.directi.com)
Licensed under Creative Commons Attribution ShareAlike NonCommercial

Agenda
- Why is Scalability important
- Introduction to the Variables and Factors
- Building our own Scalable Architecture (in incremental steps): Vertical Scaling, Vertical Partitioning, Horizontal Scaling, Horizontal Partitioning, etc.
- Platform Selection Considerations
- Tips

Why is Scalability Important in a Web 2.0 World
- Viral marketing can result in instant success
- RSS / Ajax / SOA: pull-based, polling-type XML protocols mean meta-data often exceeds data, and the number of requests grows exponentially with the user base
- RoR / Grails: the dynamic-language landscape is gaining popularity
- In the end you want to build a Web 2.0 app that can serve millions of users with ZERO downtime

The Variables
- Scalability - number of users / sessions / transactions / operations the entire system can handle
- Performance - optimal utilization of resources
- Responsiveness - time taken per operation
- Availability - probability of the application, or a portion of it, being available at any given point in time
- Downtime Impact - the impact of a downtime of a server/service/resource (number of users affected, type of impact, etc.)
- Cost
- Maintenance Effort
Goal: keep scalability, availability, performance and responsiveness high; keep downtime impact, cost and maintenance effort low.

The Factors
- Platform selection
- Hardware
- Application Design
- Database/Datastore Structure and Architecture
- Deployment Architecture
- Storage Architecture
- Abuse prevention
- Monitoring mechanisms
- and more

Let's Start
We will now build an example architecture for an example app using the following iterative, incremental steps:
- Inspect the current architecture
- Identify scalability bottlenecks
- Identify SPOFs and availability issues
- Identify downtime-impact risk zones
- Apply one of: Vertical Scaling, Vertical Partitioning, Horizontal Scaling, Horizontal Partitioning
- Repeat the process

Step 1 - Let's Start
App Server & DB Server

Step 2 - Vertical Scaling
Add RAM and CPUs to the App Server and DB Server.

Step 2 - Vertical Scaling (App Server, DB Server)
Introduction
- Increasing the hardware resources without changing the number of nodes
- Referred to as "scaling up" the server
Advantages
- Simple to implement
Disadvantages
- Finite limit
- Hardware does not scale linearly (diminishing returns for each incremental unit)
- Requires downtime
- Increases downtime impact
- Incremental costs increase exponentially

Step 3 - Vertical Partitioning (Services)
Introduction
- Deploying each service on a separate node (App Server, DB Server)
Positives
- Increases per-application availability
- Task-based specialization, optimization and tuning possible
- Reduces context switching
- Simple to implement for out-of-band processes
- No changes to the app required
- Flexibility increases
Negatives
- Sub-optimal resource utilization
- May not increase overall availability
- Finite scalability
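
From the application's side, partitioning by service can be as simple as addressing each function through its own endpoint, so an out-of-band process (mail, reports) can be moved to a dedicated node without code changes. A minimal sketch; the service names and hostnames below are hypothetical.

```java
import java.util.Map;

// Hypothetical service registry: each function runs on its own node,
// so it can be scaled, tuned and restarted independently.
public class ServiceEndpoints {
    private static final Map<String, String> ENDPOINTS = Map.of(
            "web",     "http://app01.internal:8080",
            "mail",    "http://mail01.internal:8080",    // out-of-band mail sending
            "reports", "http://reports01.internal:8080"  // batch/report generation
    );

    public static String endpointFor(String service) {
        return ENDPOINTS.get(service);
    }

    public static void main(String[] args) {
        System.out.println("Mail jobs go to: " + endpointFor("mail"));
    }
}
```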

Understanding Vertical Partitioning
- The term Vertical Partitioning denotes an increase in the number of nodes by distributing the tasks/functions
- Each node (or cluster) performs separate tasks
- Each node (or cluster) is different from the others
- Vertical Partitioning can be performed at various layers (App / Server / Data / Hardware etc.)

Step 4 - Horizontal Scaling (App Server)
Introduction
- Increasing the number of App Server nodes behind a Load Balancer
- Referred to as "scaling out" the App Server
Deployment: Load Balancer -> multiple App Servers -> DB Server

Understanding Horizontal Scaling
- The term Horizontal Scaling denotes an increase in the number of nodes by replicating the nodes
- Each node performs the same tasks
- Each node is identical
- Typically the collection of nodes may be known as a cluster (though the term cluster is often misused)
- Also referred to as "scaling out"
- Horizontal Scaling can be performed for any particular type of node (App Server / DB Server etc.)

Load Balancer - Hardware vs Software
- Hardware load balancers are faster
- Software load balancers are more customizable
- With HTTP servers, load balancing is typically combined with HTTP accelerators
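
At its core a software load balancer is just a node-selection policy in front of the App Server pool. A minimal round-robin sketch (hostnames hypothetical):

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Minimal round-robin node selection, the core of a software load balancer.
public class RoundRobinBalancer {
    private final List<String> appServers;
    private final AtomicLong counter = new AtomicLong();

    public RoundRobinBalancer(List<String> appServers) {
        this.appServers = List.copyOf(appServers);
    }

    public String nextServer() {
        long n = counter.getAndIncrement();
        return appServers.get((int) Math.floorMod(n, (long) appServers.size()));
    }

    public static void main(String[] args) {
        RoundRobinBalancer lb = new RoundRobinBalancer(
                List.of("app01:8080", "app02:8080", "app03:8080"));
        for (int i = 0; i < 6; i++) {
            System.out.println(lb.nextServer()); // app01, app02, app03, app01, ...
        }
    }
}
```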

Load Balancer - Session Management: Sticky Sessions
- Requests for a given user are sent to a fixed App Server
Observations
- Asymmetrical load distribution (especially during downtimes)
- Downtime impact: loss of session data
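
Sticky routing typically pins a user to a node by hashing a session or user identifier. A minimal sketch under that assumption:

```java
import java.util.List;

// Sticky routing: the same session id always maps to the same App Server,
// so in-memory session state is found again on subsequent requests.
public class StickyRouter {
    private final List<String> appServers;

    public StickyRouter(List<String> appServers) {
        this.appServers = List.copyOf(appServers);
    }

    public String serverFor(String sessionId) {
        int idx = Math.floorMod(sessionId.hashCode(), appServers.size());
        return appServers.get(idx);
    }

    public static void main(String[] args) {
        StickyRouter router = new StickyRouter(List.of("app01", "app02", "app03"));
        System.out.println(router.serverFor("session-abc123")); // always the same node
    }
}
```

If one node goes down, every session pinned to it is lost and its users shift onto the survivors, which is exactly the asymmetric-load and downtime-impact observation above.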

Load Balancer - Session Management: Central Session Store
- Also known as a Shared Session Store Cluster
- Introduces an SPOF and an additional variable
- Session reads and writes generate disk + network I/O
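
With a central store, the App Servers become stateless front ends: every session read and write goes over the network to a shared store. The SessionStore interface below is hypothetical; the real backing could be a database table, memcached, or any shared key-value service (the in-memory class only keeps the sketch self-contained).

```java
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shared-store interface; every call crosses the network
// (and possibly hits disk), which is the cost noted on the slide.
interface SessionStore {
    void save(String sessionId, Map<String, Serializable> attributes);
    Map<String, Serializable> load(String sessionId);
}

// In-memory stand-in, used only to keep this sketch self-contained.
class InMemorySessionStore implements SessionStore {
    private final Map<String, Map<String, Serializable>> data = new ConcurrentHashMap<>();

    public void save(String sessionId, Map<String, Serializable> attributes) {
        data.put(sessionId, attributes);
    }

    public Map<String, Serializable> load(String sessionId) {
        return data.get(sessionId);
    }
}
```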

Load Balancer - Session Management: Clustered Session Management
- AKA Shared-nothing cluster
- Easier to set up; no SPOF
- Session reads are instantaneous; session writes generate network I/O
- Network I/O increases exponentially with the number of nodes
- In very rare circumstances a request may get stale session data: the user's request reaches a subsequent node faster than the intra-node message, or intra-node communication fails

Load Balancer - Session Management: Combinations
- Sticky Sessions with a Central Session Store: downtime does not cause loss of data, and session reads need not generate network I/O
- Sticky Sessions with Clustered Session Management: no specific advantages

Load Balancer - Session Management: Recommendation
- Use Clustered Session Management if you have a smaller number of App Servers and fewer session writes
- Use a Central Session Store elsewhere
- Use sticky sessions only if you have to

Load Balancer - Removing the SPOF
- In a load-balanced App Server cluster the LB itself is an SPOF
- Set up the LB in Active-Active or Active-Passive mode
- Note: Active-Active nevertheless assumes that each LB is independently able to take up the load of the other
- If you want ZERO downtime, Active-Active becomes truly cost-beneficial only when multiple LBs (more than 3 to 4) are daisy-chained as Active-Active, forming an LB cluster

Step 4 - Horizontal Scaling (App Server)
Our deployment at the end of Step 4: load-balanced App Servers in front of a single DB Server.
Positives
- Increases availability and scalability
- No changes to the app required
- Easy setup
Negatives
- Finite scalability

Step 5 - Vertical Partitioning (Hardware)
Introduction
- Partitioning out the storage function using a SAN
- SAN config options: refer to "Demystifying Storage" at http://wiki.directi.com -> Dev University -> Presentations
Positives
- Allows scaling up the DB Server
- Boosts DB Server performance
Negatives
- Increases cost

Step 6 - Horizontal Scaling (DB)
Introduction
- Increasing the number of DB nodes
- Referred to as "scaling out" the DB Server
Options
- Shared-nothing cluster
- Real Application Cluster (or shared-storage cluster)

Shared Nothing Cluster
- Each DB Server node has its own complete copy of the database; nothing is shared between the DB Server nodes
- This is achieved through DB replication at the DB / driver / app level, or through a proxy
- Supported by most RDBMSs natively or through 3rd-party software
- Note: the actual DB files may be stored on a central SAN

Replication Considerations
Master-Slave
- Writes are sent to a single master, which replicates the data to multiple slave nodes
- Replication may be cascaded
- Simple setup; no conflict management required
Multi-Master
- Writes can be sent to any of multiple masters, which replicate them to the other masters and slaves
- Conflict management required
- Deadlocks possible if the same data is simultaneously modified in multiple places

Replication Considerations
Asynchronous
- Guaranteed, but out-of-band replication from master to slave
- The master updates its own DB and returns a response to the client; replication to the slaves takes place asynchronously
- Faster response to the client; slave data is marginally behind the master
- Requires modifying the app to send critical reads and writes to the master, and to load-balance all other reads
Synchronous
- Guaranteed, in-band replication from master to slave
- The master updates its own DB and confirms all slaves have updated theirs before returning a response to the client
- Slower response to the client; slaves have the same data as the master at all times
- Requires modifying the app to send writes to the master and load-balance all reads

Replication Considerations
Replication at the RDBMS level
- Support may exist in the RDBMS or through a 3rd-party tool
- Faster and more reliable
- The app must send writes to the master, reads to any DB, and critical reads to the master
Replication at the Driver / DAO level
- The driver / DAO layer ensures writes are performed on all connected DBs, reads are load-balanced, and critical reads are sent to a master
- In most cases RDBMS-agnostic
- Slower and in some cases less reliable

Real Application Cluster
- All DB servers in the cluster share a common storage area on a SAN
- All DB servers mount the same block device
- The filesystem must be a clustered file system (e.g. GFS / OFS)
- Currently only supported by Oracle Real Application Clusters
- Can be very expensive (licensing fees)

Recommendation
- Try to choose a DB that natively supports master-slave replication
- Use master-slave async replication
- Write your DAO layer to ensure that writes are sent to a single DB, reads are load-balanced, and critical reads are sent to the master
Deployment: writes & critical reads go to the master; other reads go to the slaves.
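
The recommendation boils down to a small routing rule in the DAO layer: writes and critical (read-your-own-write) reads go to the master, everything else is load-balanced across the slaves. A minimal sketch with hypothetical JDBC URLs; in practice the connections would come from a pool and the slave list would be health-checked.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// DAO-level read/write splitting: writes and critical reads -> master,
// all other reads round-robined across slaves (hostnames are hypothetical).
public class ReplicationAwareDataSource {
    private final String masterUrl = "jdbc:mysql://db-master:3306/app";
    private final List<String> slaveUrls = List.of(
            "jdbc:mysql://db-slave1:3306/app",
            "jdbc:mysql://db-slave2:3306/app");
    private final AtomicInteger next = new AtomicInteger();

    public Connection forWrite() throws SQLException {
        return DriverManager.getConnection(masterUrl);
    }

    // Critical reads must see the latest write, so they also go to the master.
    public Connection forCriticalRead() throws SQLException {
        return forWrite();
    }

    public Connection forRead() throws SQLException {
        int idx = Math.floorMod(next.getAndIncrement(), slaveUrls.size());
        return DriverManager.getConnection(slaveUrls.get(idx));
    }
}
```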

Step 6 - Horizontal Scaling (DB)
Our architecture now: load-balanced App Servers in front of a DB cluster backed by a SAN.
Positives
- As web servers grow, database nodes can be added
- The DB Server is no longer an SPOF
Negatives
- Finite limit

Step 6 - Horizontal Scaling (DB): the limit
Shared-nothing clusters have a finite scaling limit: reads are load-balanced across the nodes, but every write must be applied on every node, so the per-node write load never shrinks.
With a 2:1 read-to-write ratio (8 reads => 4 writes):
- 2 DBs: per DB, 4 reads and 4 writes
- 4 DBs: per DB, 2 reads and 4 writes
- 8 DBs: per DB, 1 read and 4 writes
At some point adding another node brings negligible incremental benefit.
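
The slide's arithmetic can be reproduced directly, assuming reads are spread evenly and every write is replayed on every node:

```java
// Per-node load in a replicated shared-nothing cluster: reads are shared,
// but every write must be applied on every node (2:1 read/write mix assumed).
public class ShardlessScalingLimit {
    public static void main(String[] args) {
        double reads = 8, writes = 4;   // the slide's example workload
        for (int nodes : new int[]{2, 4, 8, 16}) {
            System.out.printf("%2d nodes -> %.1f reads + %.0f writes per node%n",
                    nodes, reads / nodes, writes);
        }
    }
}
```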

Step 7 - Vertical / Horizontal Partitioning (DB)
Introduction
- Increasing the number of DB clusters by dividing the data
Options
- Vertical Partitioning - dividing by tables / columns
- Horizontal Partitioning - dividing by rows (value)

Vertical Partitioning (DB)
- Take a set of tables and move them onto another DB. E.g. in a social network, the users table and the friends table can be on separate DB clusters
- Each DB cluster has different tables
- Application code, DAO / driver code, or a proxy knows where a given table is and directs queries to the appropriate DB
- Can also be done at the column level by moving a set of columns into a separate table

Vertical Partitioning (DB)
Negatives
- One cannot perform SQL joins or maintain referential integrity across clusters (referential integrity is, as such, overrated)
- Finite limit
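
Routing by table can be as simple as a static map from table to DB cluster, consulted by the DAO/driver layer before issuing a query. A sketch; table names and URLs are hypothetical.

```java
import java.util.Map;

// Table-to-cluster routing for vertical partitioning: the DAO layer looks up
// which DB cluster owns a table before sending the query there.
public class TableRouter {
    private static final Map<String, String> TABLE_TO_CLUSTER = Map.of(
            "users",    "jdbc:mysql://users-db:3306/app",
            "friends",  "jdbc:mysql://social-db:3306/app",
            "messages", "jdbc:mysql://social-db:3306/app");

    public static String clusterFor(String table) {
        String url = TABLE_TO_CLUSTER.get(table);
        if (url == null) throw new IllegalArgumentException("unmapped table: " + table);
        return url;
    }
}
```

Note that a query touching tables on two different clusters can no longer be a single SQL join, which is the negative listed above.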

Horizontal Partitioning (DB)
- Take a set of rows and move them onto another DB. E.g. in a social network, each DB cluster can contain all the data for 1 million users
- Each DB cluster has identical tables
- Application code, DAO / driver code, or a proxy knows where a given row is and directs queries to the appropriate DB
Negatives
- SQL unions for search-type queries must be performed within code

Horizontal Partitioning (DB) - Techniques
- FCFS: the 1st million users are stored on cluster 1, the next million on cluster 2, and so on
- Round Robin
- Least Used (Balanced): each time a new user is added, the DB cluster with the fewest users is chosen
- Hash based: a hashing function determines the DB cluster in which the user's data should be inserted
- Value based: user ids 1 to 1 million stored on cluster 1, OR all users with names starting A-M on cluster 1
Except for hash- and value-based placement, all other techniques also require an independent lookup map from user to DB cluster. This map is itself stored on a separate DB (which may in turn need to be replicated). A sketch of both approaches follows.
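
A minimal sketch of the two families of techniques: hash- and value-based placement compute the cluster from the user id, while FCFS / round-robin / least-used placement record it in a lookup map. Class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Two ways to find the DB cluster that holds a user's rows.
public class UserShardRouter {
    private final int clusterCount;
    // Lookup map needed for FCFS / round-robin / least-used placement;
    // in production this map lives in its own (replicated) database.
    private final Map<Long, Integer> userToCluster = new ConcurrentHashMap<>();

    public UserShardRouter(int clusterCount) {
        this.clusterCount = clusterCount;
    }

    // Hash-based: cluster is derived from the user id, no lookup required.
    public int hashBased(long userId) {
        return (int) Math.floorMod(userId, (long) clusterCount);
    }

    // Value-based: fixed ranges of ids per cluster (1 million users per cluster here).
    public int valueBased(long userId) {
        return (int) (((userId - 1) / 1_000_000) % clusterCount);
    }

    // FCFS / least-used: placement is decided once, recorded, then read back.
    public void record(long userId, int cluster) {
        userToCluster.put(userId, cluster);
    }

    public Integer lookup(long userId) {
        return userToCluster.get(userId);
    }
}
```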

Step 7 - Vertical / Horizontal Partitioning (DB)
Our architecture now: load-balanced App Servers in front of multiple DB clusters (each backed by the SAN), plus a lookup map.
Positives
- As app servers grow, database clusters can be added
- Note: this is not the same as the table partitioning provided by the DB (e.g. MSSQL)
- We may actually want to further segregate these into Sets, each serving a collection of users (see the next slide)

Step 8 - Separating Sets
Now we consider each deployment as a single Set serving a collection of users.
A Global Redirector with a Global Lookup Map sits in front of the Sets; each Set contains its own load-balanced App Servers, lookup map, DB clusters and SAN (e.g. Set 1 - 10 million users, Set 2 - 10 million users).
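
The Global Redirector is one more level of the same lookup idea: map a user to a Set, then let that Set's own lookup map find the right DB cluster. A tiny sketch; the Set URLs are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Global redirector: routes a user to the Set (independent deployment)
// that owns their data. Set URLs are hypothetical.
public class GlobalRedirector {
    private final Map<Long, String> globalLookup = new ConcurrentHashMap<>();

    // Assignment could pick the Set with spare capacity, or the one
    // closest to the user in terms of network latency.
    public void assign(long userId, String setUrl) {
        globalLookup.put(userId, setUrl);
    }

    public String setFor(long userId) {
        return globalLookup.getOrDefault(userId, "https://set1.example.com");
    }
}
```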

Creating Sets
- The goal behind creating Sets is easier manageability
- Each Set is independent and handles transactions for a set of users
- Each Set is architecturally identical to the others
- Each Set contains the entire application with all its data structures
- Sets can even be deployed in separate datacenters
- Users may even be added to the Set that is closest to them in terms of network latency

Step 8 - Horizontal Partitioning (Sets)
Our architecture now: a Global Redirector in front of multiple Sets, each with its own App Server cluster, DB clusters and SAN.
Positives
- Infinite scalability
Negatives
- Aggregation of data across Sets is complex
- Users may need to be moved across Sets if sizing is improper
- Global app settings and preferences need to be replicated across Sets

Step 9 - Caching
Add caches within the App Server:
- Object cache
- Session cache (especially if you are using a Central Session Store)
- API cache
- Page cache
Software
- memcached
- Terracotta (Java only)
- Coherence (commercial, expensive data grid by Oracle)
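
Whichever product is used, application code typically follows the cache-aside pattern: check the cache, fall back to the database, then populate the cache for the next request. A sketch against a generic interface; the interface is hypothetical, but memcached, Terracotta and Coherence clients all expose equivalent get/put operations.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical minimal cache API.
interface SimpleCache<K, V> {
    Optional<V> get(K key);
    void put(K key, V value);
}

// Cache-aside ("lazy loading"): check the cache, fall back to the loader
// (typically a DB query), then populate the cache for the next request.
public class CacheAside<K, V> {
    private final SimpleCache<K, V> cache;
    private final Function<K, V> loader;

    public CacheAside(SimpleCache<K, V> cache, Function<K, V> loader) {
        this.cache = cache;
        this.loader = loader;
    }

    public V get(K key) {
        return cache.get(key).orElseGet(() -> {
            V value = loader.apply(key);  // miss: load from the database
            cache.put(key, value);        // and populate the cache
            return value;
        });
    }
}
```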

Step 10 - HTTP Accelerator
If your app is a web app you should add an HTTP accelerator or a reverse proxy. A good HTTP accelerator / reverse proxy performs the following:
- Redirects static content requests to a lighter HTTP server (e.g. lighttpd)
- Caches content based on rules (with granular invalidation support)
- Uses async NIO on the user side
- Maintains a limited pool of keep-alive connections to the App Server
- Intelligent load balancing
Solutions
- Nginx (HTTP / IMAP)
- Perlbal
- Hardware accelerators plus load balancers

Step 11 - Other Cool Stuff
- CDNs
- IP Anycasting
- Async non-blocking I/O (for all network servers)
- If possible, async non-blocking I/O for disk
- Incorporate a multi-layer caching strategy where required: L1 cache in-process with the App Server, L2 cache across the network boundary, L3 cache on disk
- Grid computing: Java - GridGain; Erlang - natively built in
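
The L1/L2/L3 idea is a read-through chain: check the in-process cache first, then the network cache, then disk, and back-fill the faster layers on the way out. A sketch, assuming the same minimal cache interface idea as in Step 9 (the CacheLayer interface is hypothetical).

```java
import java.util.List;
import java.util.Optional;

// One layer of the cache hierarchy; L1 would wrap an in-process map,
// L2 a network cache (e.g. memcached), L3 a disk store.
interface CacheLayer<K, V> {
    Optional<V> get(K key);
    void put(K key, V value);
}

// Read-through lookup across L1 -> L2 -> L3, back-filling faster layers on a hit.
public class TieredCache<K, V> {
    private final List<CacheLayer<K, V>> layers; // ordered fastest to slowest

    public TieredCache(List<CacheLayer<K, V>> layers) {
        this.layers = List.copyOf(layers);
    }

    public Optional<V> get(K key) {
        for (int i = 0; i < layers.size(); i++) {
            Optional<V> hit = layers.get(i).get(key);
            if (hit.isPresent()) {
                // populate the faster layers so the next lookup stops earlier
                for (int j = 0; j < i; j++) {
                    layers.get(j).put(key, hit.get());
                }
                return hit;
            }
        }
        return Optional.empty();
    }
}
```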

Platform Selection Considerations
Programming languages and frameworks
- Dynamic languages are slower than static languages
- Compiled code runs faster than interpreted code -> use accelerators or pre-compilers
- Frameworks that provide dependency injection, reflection, annotations have a marginal performance impact
- ORMs hide DB querying, which can in some cases result in poor query performance due to non-optimized queries
RDBMS
- MySQL, MSSQL and Oracle support native replication
- Postgres supports replication through 3rd-party software (Slony)
- Oracle supports Real Application Clustering
- MySQL uses locking and arbitration, while Postgres/Oracle use MVCC (MSSQL just recently introduced MVCC)
Cache
- Terracotta vs memcached vs Coherence

Tips
- All the techniques we learnt today can be applied in any order
- Try to incorporate horizontal DB partitioning by value into your design from the beginning
- Loosely couple all modules
- Implement a REST-ful framework for easier caching
- Perform application sizing on an ongoing basis to ensure optimal utilization of hardware
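
The "REST-ful framework for easier caching" tip largely comes down to making responses cacheable by intermediaries: stable resource URLs plus explicit cache headers that the reverse proxy from Step 10 can honor. A hedged sketch assuming the standard servlet API; the path, parameter and ETag scheme are hypothetical (in practice the ETag would be derived from the resource's version or last-modified time).

```java
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Emitting cache headers so the HTTP accelerator / reverse proxy (Step 10)
// can serve repeated GETs without hitting the App Server.
public class ProfileServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String userId = req.getParameter("id");
        String etag = "\"profile-" + userId + "\"";  // placeholder version tag

        if (etag.equals(req.getHeader("If-None-Match"))) {
            resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED); // proxy/client copy is still valid
            return;
        }
        resp.setHeader("ETag", etag);
        resp.setHeader("Cache-Control", "public, max-age=60");   // cacheable for 60s
        resp.setContentType("application/json");
        resp.getWriter().write("{\"userId\": \"" + userId + "\"}");
    }
}
```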

Intelligent People. Uncommon Ideas.
Questions??
bhavin.t@directi.com
http://directi.com http://careers.directi.com
Download slides: http://wiki.directi.com