Designing, Scoping, and Configuring Scalable LAMP Infrastructure

Presented 2010-05-19 by

About me
Founded Four Kitchens in 2006 while at UT Austin
In 2008, launched Pressflow, which now powers the largest Drupal sites
Worked with some of the largest sites in the world: Lifetime Digital, Mansueto Ventures, Wikipedia, The Internet Archive, and The Economist
Engineered the LAMP stack, deployment tools, and management tools for Yale University, multiple NBC-Universal properties, and Drupal.org

Some assumptions
You have more than one web server
You have root access
You deploy to Linux (though PHP on Windows is more sane than ever)
Database and web servers occupy separate boxes
Your application behaves more or less like Drupal, WordPress, or MediaWiki

Understanding Load Distribution

Predicting peak traffic
Traffic over the day can be highly irregular. To plan for peak loads, design as if all traffic were as heavy as the peak hour of load in a typical month, and then plan for some growth.
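The peak-hour rule above can be turned into a quick calculation. This is a minimal sketch, not a sizing tool: the function name `design_rate`, the sample hourly counts, and the 25% growth allowance are all illustrative assumptions.

```python
import math


def design_rate(hourly_hits, growth=0.25):
    """Return the requests/second to design for.

    hourly_hits: hits counted in each hour of a typical month (assumed input).
    growth: fractional allowance for growth (0.25 is only an example value).
    """
    peak_hour = max(hourly_hits)          # busiest single hour
    peak_per_second = peak_hour / 3600    # treat the whole hour as evenly busy
    return math.ceil(peak_per_second * (1 + growth))


# Hypothetical hourly hit counts; the busiest hour has 720,000 hits.
sample = [36_000, 90_000, 720_000, 180_000]
print(design_rate(sample))  # 250
```

Averaging over the day instead of taking the busiest hour would hide exactly the irregularity the slide warns about.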

Analyzing hit distribution
[Diagram: breakdown of all hits (100%)]
Static Content: 30%
Dynamic Pages: 70%
  Authenticated: 20%
  Anonymous: 50%
    Human: 40%
    Web Crawler: 10%
      No Special Treatment: 3%
      Pay Wall Bypass: 7%

Throughput vs. Delivery Methods
[Chart: throughput of each delivery method, with axis labels 5000 req/s and 10 req/s. More dots = more throughput.]
Content types: Green (Static), Yellow (Dynamic, Cacheable), Red (Dynamic)
Delivery methods: Content Delivery Network, Reverse Proxy Cache, PHP + APC + memcached, PHP + APC, PHP (No APC)
Footnote 1: Delivered by Apache without PHP
Footnote 2: Some actually can do this.

Objective
Deliver hits using the fastest, most scalable method available

Layering: Less Traffic at Each Step
[Diagram: layers shown are Traffic, CDN, DNS Round Robin, Load Balancer, Reverse Proxy Cache, Application Server, and Database, with a "Your Datacenter" boundary]

Offload from the master database
Your master database is the single greatest limitation on scalability.
[Diagram: the Application Server connects to Search, a Slave Database, and a Memory Cache as well as the Master Database]

Tools to use
Apache Solr or Sphinx for search
  Solr can be fronted with Varnish or another proxy cache if queries are repetitive.
Varnish, nginx, Squid, or Traffic Server for reverse proxy caching
Any third-party service for CDN

Do the math
All non-CDN traffic travels through your load balancers and reverse proxy caches. Even traffic passed through to application servers must run through the initial layers.
[Diagram: Internal Traffic, Load Balancer, Reverse Proxy Cache]
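To make the arithmetic concrete, here is a rough sketch of how much traffic reaches each layer. It borrows the 30% static and 50% anonymous shares from the hit distribution slide. The assumptions that the CDN absorbs all static hits and that the proxy cache answers 90% of anonymous hits are illustrative guesses, not measurements, and `layer_traffic` is a made-up name.

```python
def layer_traffic(total_rps, static_share=0.30, anonymous_share=0.50,
                  proxy_hit_ratio=0.9):
    """Estimate requests/second seen at each layer.

    Assumes (for illustration only) that the CDN serves every static hit and
    that the reverse proxy cache answers proxy_hit_ratio of anonymous hits.
    """
    cdn = total_rps * static_share
    # Everything the CDN does not serve passes through the load balancers
    # and reverse proxy caches, whether or not it is cacheable.
    internal = total_rps - cdn
    proxy_hits = total_rps * anonymous_share * proxy_hit_ratio
    # Authenticated hits plus anonymous cache misses reach the app servers.
    app_servers = internal - proxy_hits
    return {
        "cdn": round(cdn),
        "load_balancer_and_proxy": round(internal),
        "served_from_proxy_cache": round(proxy_hits),
        "application_servers": round(app_servers),
    }


print(layer_traffic(1000))
```

With 1000 req/s of total traffic, the load balancers and proxy caches still see 700 req/s even though only 250 req/s reach the application servers.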

Get a management/monitoring box
(Maybe even two, and have them specialize or be redundant.)
[Diagram: a Management box alongside the Load Balancer, Reverse Proxy Cache, Application Server, and Database]

Planning + Scoping

Infrastructure goals
Redundancy: tolerate failure
Scalability: engage more users
Performance: ensure each user's experience is fast
Manageability: stay sane in the process

Redundancy
When one server fails, the website should be able to recover without taking too long.
This requires at least N+1, putting a floor on system requirements even for small sites.
How long can your site be down?

Performance
Find the sweet spot for hardware. This is the best price/performance point.
Avoid overspending on any type of component
Yet, avoid creating bottlenecks
Swapping memory to disk is very dangerous

Relative importance
[Table: relative importance of Processors/Cores, Memory, and Disk Speed (columns) for Reverse Proxy Cache, Web Server, Database Server, and Monitoring (rows)]

All of your servers
64-bit: no excuse to use anything less in 2010
RHEL/CentOS and Ubuntu have the broadest adoption for large-scale LAMP
But pick one, and stick with it for development, staging, and production

Reverse proxy caches
Varnish and nginx have modern architecture and broad adoption
  Sites often front Varnish with nginx for gzip and/or SSL
Squid and Traffic Server are clunky but reliable alternatives
CPU: Save your money

Web servers
Apache 2.2 + mod_php + memcached
FastCGI is a bad idea
  Memory improvements are redundant with Varnish
  Higher latency and less efficient with the APC opcode cache
Check the memory your app takes per process
Tune MaxClients to around 25 x cores
CPU: Max out cores (but prefer fast cores to density)
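The MaxClients guidance can be sketched as a small calculation. It reads the slide's figure as roughly 25 Apache processes per core (an interpretation) and caps the result by memory so the box never swaps. The function name, the reserved-memory figure, and the sample numbers are hypothetical.

```python
def max_clients(cores, ram_mb, reserved_mb, per_process_mb, per_core=25):
    """Suggest a MaxClients value bounded by both CPU and memory.

    reserved_mb: memory kept for the OS, memcached, etc. (assumed figure).
    per_process_mb: measured memory of one Apache + mod_php process.
    """
    by_cpu = per_core * cores
    by_memory = (ram_mb - reserved_mb) // per_process_mb
    # Take the lower bound: exceeding by_memory would push the box into swap.
    return max(1, min(by_cpu, by_memory))


# 8 cores, 16 GB RAM, 2 GB reserved, 64 MB per process: CPU-bound at 200.
print(max_clients(8, 16384, 2048, 64))
# Same cores with only 8 GB RAM: memory-bound at 96.
print(max_clients(8, 8192, 2048, 64))
```

Measuring the real per-process footprint under load matters more than the exact multiplier, since an optimistic figure here is what leads to swapping.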

Database servers
Insist on MySQL 5.1+ and InnoDB
Consider Percona builds and (eventually) MariaDB
Every Apache process generally needs at least one connection available, so leave some headroom
Tune the InnoDB buffer pool to at least half of RAM
CPU: No more than 8-12 cores
Memory: As much as you can afford (even RAM not used by MySQL caches disk content)
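As a rough illustration of the connection and buffer pool guidance, the sketch below derives two MySQL settings from the web tier's size. `max_connections` and `innodb_buffer_pool_size` are real MySQL variables. The helper name, the 50 spare connections, and the sample cluster are assumptions made for the example.

```python
def db_settings(web_servers, max_clients, ram_gb, spare_connections=50):
    """Derive starting values for two MySQL settings.

    One connection per Apache process across all web servers, plus some
    headroom (spare_connections is an arbitrary example value), and a
    buffer pool of half the machine's RAM as a lower bound.
    """
    connections = web_servers * max_clients + spare_connections
    buffer_pool_gb = ram_gb // 2
    return {
        "max_connections": connections,
        "innodb_buffer_pool_size": f"{buffer_pool_gb}G",
    }


# Hypothetical cluster: 4 web servers at MaxClients 200, 32 GB database box.
print(db_settings(4, 200, 32))
```

These are starting points to validate against monitoring, not final values.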

Management server
Nagios: service outage monitoring
Cacti: trend monitoring
Hudson: builds, deployment, and automation
Yum/Apt repo: cluster package distribution
Puppet/BCFG2/Chef: configuration management
CPU: Save your money

Assembling the numbers
Start with an architecture providing redundancy.
  Two servers, each running the whole stack
Increase the number of proxy caches based on anonymous and search engine traffic.
Increase the number of web servers based on authenticated traffic.
Databases are harder to predict, but large sites should run them on at least two separate boxes with replication.
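One way to assemble the numbers is to divide each layer's traffic by what a single server can handle, then add a spare for N+1 redundancy. This is only a sketch: the per-server capacities and traffic figures below are invented for illustration and would need to come from load testing.

```python
import math


def servers_needed(rps, per_server_rps, spare=1, minimum=2):
    """Servers required for a layer, with N+1 redundancy.

    per_server_rps is a measured capacity for one box (hypothetical here).
    minimum=2 reflects the redundancy floor of two servers.
    """
    return max(minimum, math.ceil(rps / per_server_rps) + spare)


# Proxy caches: 700 req/s of non-CDN traffic, assumed 2000 req/s per cache.
print(servers_needed(700, 2000))  # 2
# Web servers: 250 req/s reaching PHP, assumed 20 req/s per web server.
print(servers_needed(250, 20))    # 14
```

The gap between the two results shows why pushing hits to the proxy layer is so much cheaper than adding web servers.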

Extreme measures for performance and scalability

When caching and search offloading isn't enough
Some sites have intense custom page needs
  High proportion of authenticated users
  Lots of targeted content for anonymous users
Too much data to process real-time on an RDBMS

Non-relational/NoSQL tools
Most web applications can run well on less-than-ACID persistence engines
In some cases, like MongoDB, easier to use than SQL in addition to being higher performance
  Interested? You've already missed the tutorial.
In other cases, like Cassandra, considerably harder to use than SQL but massively scalable

Offline processing
Gearman
  Primarily asynchronous job manager
Hadoop
  MapReduce framework
Traditional message queues
  ActiveMQ + Stomp is easy from PHP
  Allows you to build your own job manager

Edge-side includes
[Diagram: the ESI Processor (Varnish, Akamai, other) receives a page containing an include tag, fetches the referenced block, and returns the assembled page]
Page with include tag:
  <html> <body> <esi:include src="http://drupal.org/block/views/3" /> </body> </html>
Block fetched by the ESI processor:
  <div> My block HTML. </div>
Assembled page:
  <html> <body> <div> My block HTML. </div> </body> </html>
Blocks of HTML are integrated into the page at the edge layer.
Non-primary page content often occupies >50% of PHP execution time.
Decouples block and page cache lifetimes
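The assembly step can be imitated in a few lines. This toy processor is a sketch only: it handles nothing but self-closing include tags with a quoted `src` attribute (the attribute name used by the ESI 1.0 language), and it looks fragments up in a dictionary instead of making HTTP requests. Real processors such as Varnish do far more.

```python
import re

# Matches only the simplest form: <esi:include src="..." />
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def process_esi(page, fetch):
    """Replace each include tag with the fragment returned by fetch(url)."""
    return ESI_INCLUDE.sub(lambda match: fetch(match.group(1)), page)


# Stand-in for the origin server: maps block URLs to rendered HTML.
fragments = {
    "http://drupal.org/block/views/3": "<div> My block HTML. </div>",
}

page = ('<html> <body> '
        '<esi:include src="http://drupal.org/block/views/3" /> '
        '</body> </html>')

assembled = process_esi(page, fragments.__getitem__)
print(assembled)
```

Because the page shell and the block are fetched separately, each can be cached with its own lifetime.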

HipHop PHP
Compiles PHP to a C++-based binary
Integrated HTTP server
Supports a subset of PHP and extensions
Requires an organizational commitment to building, testing, and deploying on HipHop
Scott MacVicar has a presentation on HipHop later today at 16:00.

Cluster Problems

Server failure
Load balancers can remove broken or overloaded reverse proxy caches from rotation.
Reverse proxy caches like Varnish can automatically use only functional application servers.
Memcached clients automatically handle failure.
Virtual service IP management tools like heartbeat2 can manage which MySQL servers receive connections to automate failover.

Cluster coherency
Systems that run properly on single boxes may lose coherency when run on a networked cluster.
Some caches, like APC's object cache, have no ability to handle network-level coherency. (APC's opcode cache is safe to use on clusters, though.)
memcached, if misconfigured, can hash values inconsistently across the cluster, resulting in different servers using different memcached instances for the same keys.
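The memcached problem is easy to reproduce with a simplified stand-in for client-side key hashing. The sketch below uses CRC32 modulo the server count, which is only an approximation of what real clients do, and the host names are invented. Two web servers list the same cache instances in a different order and end up disagreeing about where a key lives.

```python
import zlib


def pick_server(key, servers):
    """Choose a cache instance by hashing the key onto the server list."""
    return servers[zlib.crc32(key.encode()) % len(servers)]


# Same two instances, but configured in a different order on each web server.
web1 = ["cache-a:11211", "cache-b:11211"]
web2 = ["cache-b:11211", "cache-a:11211"]

for key in ("session:42", "page:front"):
    print(key, pick_server(key, web1), pick_server(key, web2))
```

With the lists reversed, every key maps to a different instance on each web server, so one server can read a stale value that the other already replaced. Deploying an identical server list everywhere avoids this.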

Cache regeneration races
Downside to network cache coherency: synchronized expiration
Requires a locking framework (like ZooKeeper)
[Timeline: Old Cached Item, then Expiration, then all servers regenerating the item, then New Cached Item]
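The race and its fix can be simulated inside one process. In this sketch a `threading.Lock` stands in for a distributed lock (the slide suggests a framework like ZooKeeper), a dictionary stands in for the shared cache, and eight threads play the role of servers that all notice the expired item at once. All names are illustrative.

```python
import threading

cache = {}                      # stand-in for the shared network cache
rebuild_lock = threading.Lock() # stand-in for a cluster-wide lock
regenerations = 0


def expensive_rebuild():
    """Pretend to regenerate the item and count how often that happens."""
    global regenerations
    regenerations += 1          # only ever called while holding the lock
    return "fresh value"


def get_item(key):
    value = cache.get(key)
    if value is not None:
        return value
    with rebuild_lock:
        # Re-check after acquiring the lock: another worker may have
        # finished the rebuild while this one was waiting.
        value = cache.get(key)
        if value is None:
            value = expensive_rebuild()
            cache[key] = value
    return value


workers = [threading.Thread(target=get_item, args=("front_page",))
           for _ in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(regenerations)  # 1
```

Without the lock and the second check, each of the eight workers could run the rebuild itself, which is the stampede the timeline depicts.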

Broken replication
MySQL slave servers get out of sync and fall further behind
No (sane) method of automated recovery
Only solvable with good monitoring and recovery procedures
Can automate blacklisting DB slaves from use, but this requires cluster management tools
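Blacklisting can be as simple as filtering slaves by replication lag before the application picks one. The sketch below assumes a monitoring job has already collected each slave's lag in seconds (for example, the `Seconds_Behind_Master` value from `SHOW SLAVE STATUS`, which is NULL when replication is not running). The host names and the 30-second threshold are hypothetical.

```python
def usable_slaves(lag_by_host, max_lag_seconds=30):
    """Return slaves that are replicating and not too far behind.

    lag_by_host maps host name to lag in seconds, or None when
    replication is stopped or broken on that host.
    """
    return sorted(
        host for host, lag in lag_by_host.items()
        if lag is not None and lag <= max_lag_seconds
    )


# db3 has fallen behind and db4 has stopped replicating entirely.
status = {"db2": 0, "db3": 120, "db4": None}
print(usable_slaves(status))  # ['db2']
```

If the list comes back empty, the application would need to fall back to the master, which is one reason the recovery procedures still matter.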

DrupalCamp Stockholm Presentation Ended Here

Managing the Cluster

The problem
[Diagram: Software and Configuration must reach five Application Servers]
Objectives:
  Fast, atomic deployment and rollback
  Minimize single points of failure and contention
  Restart services
  Integrate with version control systems

Manual updates and deployment
[Diagram: five Humans, each updating one of five Application Servers]
Why not: slow deployment, non-atomic/difficult rollbacks

Shared storage
[Diagram: five Application Servers sharing NFS]
Why not: single point of contention and failure

rsync
[Diagram: five Application Servers synchronized with rsync]
Why not: non-atomic, does not manage services

Capistrano
[Diagram: five Application Servers deployed with Capistrano]
Capistrano provides near-atomic deployment, service restarts, automated rollback, test automation, and version control integration (tagged releases).
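Near-atomic deployment and rollback usually come down to one trick: keep every release in its own directory and switch a `current` symlink between them. The sketch below imitates that layout in Python on a POSIX system. It is not Capistrano's code, and the function name, directory names, and release names are made up for illustration.

```python
import os
import tempfile


def activate_release(deploy_root, release_name):
    """Point deploy_root/current at releases/<release_name> in one step."""
    release_dir = os.path.join(deploy_root, "releases", release_name)
    current = os.path.join(deploy_root, "current")
    tmp_link = current + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)             # clean up a leftover from a failed run
    os.symlink(release_dir, tmp_link)   # build the new link beside the old one
    os.replace(tmp_link, current)       # rename over it: atomic on POSIX
    return os.path.realpath(current)


# Demo in a throwaway directory with two hypothetical releases.
root = tempfile.mkdtemp()
for name in ("20100518", "20100519"):
    os.makedirs(os.path.join(root, "releases", name))

activate_release(root, "20100518")
deployed = activate_release(root, "20100519")      # deploy the new release
rolled_back = activate_release(root, "20100518")   # rollback is the same move
print(os.path.basename(deployed), os.path.basename(rolled_back))
```

The web server never sees a half-copied tree because the old release stays untouched until the rename, which is what rsync alone cannot guarantee.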

Multistage deployment
Deployments can be staged.
  cap staging deploy
  cap production deploy
[Diagram: Development Integration, Staging, and five Application Servers, each deployed with Capistrano]

But your application isn't the only thing to manage.

Beneath the application
[Diagram: cluster-level configuration spanning the Reverse Proxy Cache, Database, and five Application Servers]
Cluster management applies to package management, updates, and software configuration.
cfengine and bcfg2 are popular cluster-level system configuration tools.

System configuration management
Deploys and updates packages, cluster-wide or selectively.
Manages arbitrary text configuration files.
Analyzes inconsistent configurations (and converges them).
Manages device classes (app. servers, database servers, etc.).
Allows confident configuration testing on a staging server.

All on the management box
[Diagram: the Management box hosts Development Integration, Staging, Deployment Tools, and Monitoring]

Monitoring

Types of monitoring
Failure
  Analyzing Downtime
  Viewing Failover
  Troubleshooting
  Notification
Capacity/Load
  Analyzing Trends
  Predicting Load
  Checking Results of Configuration and Software Changes

Everyone needs both.

What to use
Failure/Uptime: Nagios, Hyperic
Capacity/Load: Cacti, Munin

Nagios
Highly recommended.
Used by Four Kitchens and Tag1 Consulting for client work, Drupal.org, Wikipedia, etc.
Easy to install on CentOS 5 using EPEL packages.
Easy to install NRPE agents to monitor diverse services.
Can notify administrators on failure.
We use this on Drupal.org

Cacti
Highly annoying to set up.
One instance generally collects all statistics. (No agents on the systems being monitored.)
Provides flexible graphs that can be customized on demand.

Munin
Fairly easy to set up.
One instance generally collects all statistics. (Unlike Cacti, a lightweight munin-node agent runs on each monitored system.)
Provides static graphs that cannot be customized.

Pressflow
Make Drupal sites scale by upgrading core with a compatible, powerful replacement.

Common large-site issues
Drupal core requires patching to effectively support the advanced scalability techniques discussed here.
Patches often conflict and have to be reapplied with each Drupal upgrade.
The original patches are often unmaintained.
Sites stagnate, running old, insecure versions of Drupal core because updating is too difficult.

What is Pressflow?
Pressflow is a derivative of Drupal core that integrates the most popular performance and scalability enhancements.
Pressflow is completely compatible with existing Drupal 5 and 6 modules, both standard and custom.
Pressflow installs as a drop-in replacement for standard Drupal.
Pressflow is free as long as the matching version of Drupal is also supported by the community.

What are the enhancements?
Reverse proxy support
Database replication support
Lower database and session management load
More efficient queries
Testing and optimization by Four Kitchens with standard high-performance software and hardware configuration
Industry-leading scalability support by Four Kitchens and Tag1 Consulting

Four Kitchens + Tag1
Provide the development, support, scalability, and performance services behind Pressflow
Comprise most members of the Drupal.org infrastructure team
Have the most experience scaling Drupal sites of all sizes and all types

Ready to scale?
Learn more about Pressflow:
  Pick up pamphlets in the lobby
  Request Pressflow releases at fourkitchens.com
Get the help you need to make it happen:
  Talk to me (David) or Todd here at DrupalCamp
  Email shout@fourkitchens.com