Designing, Scoping, and Configuring Scalable LAMP Infrastructure Presented 2010-05-19 by
About me
- Founded Four Kitchens in 2006 while at UT Austin
- In 2008, launched Pressflow, which now powers the largest Drupal sites
- Worked with some of the largest sites in the world: Lifetime Digital, Mansueto Ventures, Wikipedia, The Internet Archive, and The Economist
- Engineered the LAMP stack, deployment tools, and management tools for Yale University, multiple NBC-Universal properties, and Drupal.org
- Engineered development workflows for Examiner.com
- Contributor to Drupal, Bazaar, Ubuntu, BCFG2, Varnish, and other open-source projects
Some assumptions
- You have more than one web server
- You have root access
- You deploy to Linux (though PHP on Windows is more sane than ever)
- Database and web servers occupy separate boxes
- Your application behaves more or less like Drupal, WordPress, or MediaWiki
Understanding Load Distribution
Predicting peak traffic
Traffic over the day can be highly irregular. To plan for peak loads, design as if all traffic were as heavy as the peak hour of load in a typical month, and then plan for some growth.
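A minimal sketch of that rule of thumb, assuming hourly hit counts for a typical month are available. The sample numbers and the 50% growth allowance are made up for illustration.

```python
def design_rate(hourly_hits, growth=0.5):
    """Return the req/s to design for: the busiest hour's average rate plus growth headroom."""
    peak_hour_hits = max(hourly_hits)
    peak_rate = peak_hour_hits / 3600.0  # average req/s during the busiest hour
    return peak_rate * (1.0 + growth)


# Hypothetical hourly hit counts (a real month would have about 720 entries).
sample_hours = [12_000, 18_000, 90_000, 360_000, 45_000]
target = design_rate(sample_hours, growth=0.5)
print(f"Design for about {target:.0f} req/s")
```

The busiest hour sets the baseline; quieter hours do not lower the target.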
Analyzing hit distribution
All hits: 100%
  Static Content: 30%
  Dynamic Pages: 70%
    Authenticated: 20%
    Anonymous: 50%
      Human: 40%
      Web Crawler: 10%
        No Special Treatment: 3%
        Pay Wall Bypass: 7%
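The breakdown can be captured as a small tree and sanity-checked so that each group's children add up to their parent. This is only an illustrative sketch using the example percentages from the slide; the data structure and function names are assumptions.

```python
# Each node: (label, percent of all hits, children)
HITS = ("All hits", 100, [
    ("Static Content", 30, []),
    ("Dynamic Pages", 70, [
        ("Authenticated", 20, []),
        ("Anonymous", 50, [
            ("Human", 40, []),
            ("Web Crawler", 10, [
                ("No Special Treatment", 3, []),
                ("Pay Wall Bypass", 7, []),
            ]),
        ]),
    ]),
])


def check(node):
    """Verify that every node's children sum to the node's own share."""
    label, share, children = node
    if children:
        total = sum(child[1] for child in children)
        assert total == share, f"{label}: children sum to {total}, expected {share}"
        for child in children:
            check(child)
    return True


def leaves(node):
    """Yield (label, share) for every leaf category."""
    label, share, children = node
    if not children:
        yield label, share
    for child in children:
        yield from leaves(child)


check(HITS)
for label, share in leaves(HITS):
    print(f"{label}: {share}%")
```

The leaf categories are the ones worth matching to a delivery method, since each can be served by a different layer.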
Throughput vs. Delivery Methods
[Figure: throughput by delivery method and content type. Content types: Green (Static), Yellow (Dynamic, Cacheable), Red (Dynamic). Delivery methods: Content Delivery Network, Reverse Proxy Cache, PHP + APC + memcached, PHP + APC, PHP (No APC). Scale runs from 10 req/s to 5000 req/s; more dots = more throughput. Footnotes: (1) Delivered by Apache without PHP. (2) Some actually can do this.]
Objective
Deliver hits using the fastest, most scalable method available.
Layering: Less Traffic at Each Step
[Diagram: incoming Traffic is split between the CDN and Your Datacenter. Layers, in order: DNS Round Robin, Load Balancer, Reverse Proxy Cache, Application Server, Database.]
Offload from the master database
Your master database is the single greatest limitation on scalability.
[Diagram: the Application Server offloads work from the Master Database to Memory Cache, Slave Database, and Search.]
Tools to use
- Apache Solr or Sphinx for search. Solr can be fronted with Varnish or another proxy cache if queries are repetitive.
- Varnish, nginx, Squid, or Traffic Server for reverse proxy caching
- Any third-party service for CDN
Do the math
All non-CDN traffic travels through your load balancers and reverse proxy caches. Even traffic passed through to application servers must run through the initial layers.
[Diagram: Internal Traffic flows through Load Balancer, Reverse Proxy Cache, and Application Server.]
- What hit rate is each layer getting?
- How many servers share the load?
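One way to answer those two questions is to push a request rate through the layers, applying each layer's hit rate and dividing by the number of servers that share it. The hit rates, server counts, and incoming rate below are placeholder assumptions, not measurements.

```python
def per_server_load(incoming_rps, layers):
    """Return [(name, total_rps, rps_per_server)] for each layer in order.

    layers: list of (name, server_count, hit_rate), where hit_rate is the
    fraction of requests the layer answers itself without passing them on.
    """
    results = []
    rate = incoming_rps
    for name, servers, hit_rate in layers:
        results.append((name, rate, rate / servers))
        rate = rate * (1.0 - hit_rate)  # only misses reach the next layer
    return results


# Hypothetical cluster: every non-CDN request crosses the first two layers.
layers = [
    ("Load Balancer", 2, 0.0),         # passes everything through
    ("Reverse Proxy Cache", 2, 0.75),  # assumed 75% cache hit rate
    ("Application Server", 5, 0.0),
]
loads = per_server_load(1000.0, layers)
for name, total, each in loads:
    print(f"{name}: {total:.0f} req/s total, {each:.0f} req/s per server")
```

Comparing the per-server figure with what one box of that type can sustain shows which layer needs more servers first.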
Get a management/monitoring box
(maybe even two and have them specialize or be redundant)
[Diagram: the Management box connects to the Load Balancer, Reverse Proxy Cache, Application Server, and Database.]
Planning + Scoping
Infrastructure goals
- Redundancy: tolerate failure
- Scalability: engage more users
- Performance: ensure each user's experience is fast
- Manageability: stay sane in the process
Redundancy
- When one server fails, the website should be able to recover without taking too long.
- This requires at least N+1, putting a floor on system requirements even for small sites.
- How long can your site be down?
- Automatic versus manual failover
- Warning: over-automation can reduce uptime
Performance
- Find the sweet spot for hardware. This is the best price/performance point.
- Avoid overspending on any type of component
- Yet, avoid creating bottlenecks
- Swapping memory to disk is very dangerous
- Don't skimp on RAM
Relative importance
[Table: Processors/Cores, Memory, and Disk Speed (columns) rated for Reverse Proxy Cache, Web Server, Database Server, and Monitoring (rows).]
All of your servers
- 64-bit: no excuse to use anything less in 2010
- RHEL/CentOS and Ubuntu have the broadest adoption for large-scale LAMP
- But pick one, and stick with it for development, staging, and production
- Some disk redundancy: rebuilding a server is time-consuming unless you're very automated
Reverse proxy caches
- Varnish and nginx have modern architecture and broad adoption
- Sites often front Varnish with nginx for gzip and/or SSL
- Squid and Traffic Server are clunky but reliable alternatives
Hardware:
  CPU: Save Your Money
  + Memory: 1 GB base system + 3 GB for caching
  + Disk: Slow + Small + Redundant
  = 5000 req/s
Web servers
- Apache 2.2 + mod_php + memcached
- FastCGI is a bad idea
- Memory improvements are redundant with Varnish
- Higher latency + less efficient with the APC opcode cache
- Check the memory your app takes per process
- Tune MaxClients to around 25 × cores
Hardware:
  CPU: Max out cores (but prefer fast cores to density)
  + Memory: 1 GB base system + 1 GB memcached + 25 × cores × per-process app memory
  + Disk: Slow + Small + Redundant
  = 100 req/s
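The memory rule of thumb above can be turned into a quick sizing check. This sketch takes MaxClients to be 25 times the number of cores; the core count and per-process figure are example assumptions you would replace with measurements from your own application.

```python
def web_server_sizing(cores, per_process_mb, base_mb=1024, memcached_mb=1024):
    """Return (MaxClients, total RAM in MB) following the slide's rule of thumb."""
    max_clients = 25 * cores
    total_mb = base_mb + memcached_mb + max_clients * per_process_mb
    return max_clients, total_mb


# Example: 8 cores, 40 MB per Apache/PHP process (both hypothetical).
max_clients, ram_mb = web_server_sizing(cores=8, per_process_mb=40)
print(f"MaxClients {max_clients}, about {ram_mb / 1024:.1f} GB RAM")
```

If the result exceeds the RAM you can afford, lower MaxClients rather than letting the box swap.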
Database servers
- Insist on MySQL 5.1+ and InnoDB
- Consider Percona builds and (eventually) MariaDB
- Every Apache process generally needs at least one connection available; leave some headroom on top of that
- Tune the InnoDB buffer pool to at least half of RAM
Hardware:
  CPU: No more than 8-12 cores
  + Memory: As much as you can afford (even RAM not used by MySQL caches disk content)
  + Disk: Fast + Large + Redundant
  = 3000 queries/s
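As a rough sketch of the connection and buffer pool guidance, the helper below derives candidate values for MySQL's `max_connections` and `innodb_buffer_pool_size`. The 10% headroom and the cluster figures are assumptions chosen for illustration.

```python
def mysql_settings(web_servers, max_clients_per_server, ram_gb,
                   headroom_percent=10, buffer_pool_fraction=0.5):
    """Suggest (max_connections, innodb_buffer_pool_size in GB)."""
    apache_processes = web_servers * max_clients_per_server
    # One connection per Apache process, plus headroom (rounded up) for
    # administration and monitoring connections.
    extra = (apache_processes * headroom_percent + 99) // 100
    max_connections = apache_processes + extra
    buffer_pool_gb = ram_gb * buffer_pool_fraction
    return max_connections, buffer_pool_gb


# Hypothetical cluster: 4 web servers with MaxClients 200, database with 32 GB RAM.
connections, pool_gb = mysql_settings(4, 200, 32)
print(f"max_connections = {connections}")
print(f"innodb_buffer_pool_size = {pool_gb:.0f}G")
```

Half of RAM is the floor the slide gives for the buffer pool; dedicated database boxes are often tuned higher.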
Management server
- Nagios: service outage monitoring
- Cacti: trend monitoring
- Hudson: builds, deployment, and automation
- Yum/Apt repo: cluster package distribution
- Puppet/BCFG2/Chef: configuration management
Hardware:
  CPU: Save Your Money
  + Memory: Save Your Money
  + Disk: Slow + Large + Redundant
  = good enough
Assembling the numbers
- Start with an architecture providing redundancy: two servers, each running the whole stack.
- Increase the number of proxy caches based on anonymous and search engine traffic.
- Increase the number of web servers based on authenticated traffic.
- Databases are harder to predict, but large sites should run them on at least two separate boxes with replication.
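A first-pass estimate along these lines might look like the sketch below. It reuses the per-server figures quoted earlier in the deck (5000 req/s per proxy cache, 100 req/s per web server), adds one spare for N+1, and deliberately ignores cache misses that fall through to the web servers. The traffic numbers are hypothetical.

```python
import math

PROXY_RPS = 5000  # per reverse proxy cache, from the earlier slide
WEB_RPS = 100     # per web server, from the earlier slide


def servers_needed(rate, per_server, minimum=2):
    """Servers to carry `rate` req/s plus one spare (N+1), never fewer than `minimum`."""
    return max(minimum, math.ceil(rate / per_server) + 1)


def plan(cacheable_rps, authenticated_rps):
    """Proxy caches scale with anonymous + crawler traffic; web servers with authenticated traffic."""
    return {
        "reverse_proxy_caches": servers_needed(cacheable_rps, PROXY_RPS),
        "web_servers": servers_needed(authenticated_rps, WEB_RPS),
        "database_servers": 2,  # at least two boxes with replication
    }


estimate = plan(cacheable_rps=6000, authenticated_rps=450)
print(estimate)
```

Treat the output as a starting point to validate with load testing, not as a purchase order.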
Extreme measures for performance and scalability
When caching and search offloading isn't enough
- Some sites have intense custom page needs
- High proportion of authenticated users
- Lots of targeted content for anonymous users
- Too much data to process in real time on an RDBMS
- Data is so volatile that the cost of maintaining standard caches outweighs the overhead of regeneration
Non-relational/NoSQL tools
- Most web applications can run well on less-than-ACID persistence engines
- In some cases, like MongoDB, easier to use than SQL in addition to being higher performance
- Interested? You've already missed the tutorial.
- In other cases, like Cassandra, considerably harder to use than SQL but massively scalable
- Current Erlang-based systems are neat but slow
- Many require a special PHP extension, at least for ideal performance
Offline processing
- Gearman: primarily an asynchronous job manager
- Hadoop: MapReduce framework
- Traditional message queues: ActiveMQ + Stomp is easy from PHP and allows you to build your own job manager
Edge-side includes
Page as generated by the application:
    <html> <body> <esi:include src="http://drupal.org/block/views/3" /> </body> </html>
Block fetched by the ESI Processor (Varnish, Akamai, other):
    <div> My block HTML. </div>
Page as delivered:
    <html> <body> <div> My block HTML. </div> </body> </html>
- Blocks of HTML are integrated into the page at the edge layer.
- Non-primary page content often occupies >50% of PHP execution time.
- Decouples block and page cache lifetimes
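To make the substitution step concrete, here is a toy ESI processor. It is a sketch only: it matches the `src` attribute defined by the ESI language specification, handles just the self-closing include tag, and looks fragments up in a hypothetical in-memory dictionary instead of fetching them over HTTP.

```python
import re

# Hypothetical fragment store standing in for HTTP fetches of each include URL.
FRAGMENTS = {
    "http://drupal.org/block/views/3": "<div> My block HTML. </div>",
}

ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def process_esi(page, fetch=FRAGMENTS.get):
    """Replace each <esi:include> tag with the fragment returned by fetch(url)."""
    def substitute(match):
        fragment = fetch(match.group(1))
        return fragment if fragment is not None else ""
    return ESI_INCLUDE.sub(substitute, page)


page = '<html> <body> <esi:include src="http://drupal.org/block/views/3" /> </body> </html>'
assembled = process_esi(page)
print(assembled)
```

Because the page shell and the block are fetched separately, each can be cached with its own lifetime.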
HipHop PHP
- Compiles PHP to a C++-based binary
- Integrated HTTP server
- Supports a subset of PHP and extensions
- Requires an organizational commitment to building, testing, and deploying on HipHop
- Scott MacVicar has a presentation on HipHop later today at 16:00.
Cluster Problems
Server failure
- Load balancers can remove broken or overloaded reverse proxy caches.
- Reverse proxy caches like Varnish can automatically use only functional application servers.
- Memcached clients automatically handle failure.
- Virtual service IP management tools like heartbeat2 can manage which MySQL servers receive connections to automate failover.
- Conclusion: Each layer intelligently monitors and uses the servers beneath it.
Cluster coherency
- Systems that run properly on single boxes may lose coherency when run on a networked cluster.
- Some caches, like APC's object cache, have no ability to handle network-level coherency. (APC's opcode cache is safe to use on clusters, though.)
- memcached, if misconfigured, can hash values inconsistently across the cluster, resulting in different servers using different memcached instances for the same keys.
- Session coherency issues can be helped with load balancer affinity or storage in memcached
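The memcached point is easy to reproduce. The sketch below uses a naive CRC32-modulo scheme as a stand-in for a client's key distribution (real clients use their own hashing, so treat this as an assumption) and shows what happens when two web servers list the same pool in a different order. The addresses are made up.

```python
import zlib


def pick_server(key, servers):
    """Naive modulo hashing: the chosen server depends on the order of the list."""
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


# Two web servers configured with the same memcached pool in a different order.
web1_pool = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
web2_pool = ["10.0.0.3:11211", "10.0.0.1:11211", "10.0.0.2:11211"]

keys = [f"session:{n}" for n in range(100)]
mismatched = [k for k in keys if pick_server(k, web1_pool) != pick_server(k, web2_pool)]
print(f"{len(mismatched)} of {len(keys)} keys map to different instances")
```

Keeping the server list identical, in the same order, on every web server avoids this; configuration management is a natural place to enforce it.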
Cache regeneration races
- Downside to network cache coherency: synchronized expiration
- Preventing the race requires a locking framework (like ZooKeeper)
[Diagram: timeline showing the Old Cached Item, its Expiration, a window with all servers regenerating the item, and then the New Cached Item.]
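The locking idea can be sketched in a single process. Here a `threading.Lock` stands in for a cluster-wide lock such as one provided by ZooKeeper, and a dictionary stands in for the shared cache; both substitutions are assumptions made to keep the example self-contained.

```python
import threading
import time

cache = {}
regen_lock = threading.Lock()  # stand-in for a cluster-wide lock
regenerations = 0


def expensive_build():
    """Simulate slow page or query generation and count how often it runs."""
    global regenerations
    regenerations += 1
    time.sleep(0.05)
    return "fresh value"


def get_item(key):
    """Return the cached item, letting only one caller regenerate it."""
    value = cache.get(key)
    if value is not None:
        return value
    with regen_lock:
        # Re-check: another caller may have rebuilt the item while we waited.
        value = cache.get(key)
        if value is None:
            value = expensive_build()
            cache[key] = value
    return value


threads = [threading.Thread(target=get_item, args=("front_page",)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"regenerated {regenerations} time(s)")
```

Without the lock and the re-check, every caller that sees the expired item would rebuild it at once, which is the stampede the timeline describes.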
Broken replication
- MySQL slave servers get out of sync and fall further behind
- No (sane) method of automated recovery
- Only solvable with good monitoring and recovery procedures
- Can automate blacklisting of DB slaves from use, but this requires cluster management tools
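Slave blacklisting can be as simple as filtering on replication lag. The sketch below assumes a monitoring job has already collected `Seconds_Behind_Master` from `SHOW SLAVE STATUS` on each slave; the host names and the 30-second threshold are hypothetical.

```python
def usable_slaves(status_by_host, max_lag_seconds=30):
    """Return hosts whose replication is running and not too far behind.

    status_by_host maps host -> Seconds_Behind_Master as reported by
    SHOW SLAVE STATUS (None when replication is stopped or broken).
    """
    return sorted(
        host for host, lag in status_by_host.items()
        if lag is not None and lag <= max_lag_seconds
    )


# Hypothetical readings collected by a monitoring job.
readings = {"db-slave-1": 0, "db-slave-2": 420, "db-slave-3": None}
healthy = usable_slaves(readings)
print(healthy)
```

The application or cluster tooling would then send read queries only to the returned hosts, leaving recovery of the others to an administrator.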
All content in this presentation, except where noted otherwise, is Creative Commons Attribution-ShareAlike 3.0 licensed and copyright 2009 Four Kitchen Studios, LLC.
DrupalCamp Stockholm Presentation Ended Here
Managing the Cluster
The problem
[Diagram: Software and Configuration must reach five Application Servers.]
Objectives:
- Fast, atomic deployment and rollback
- Minimize single points of failure and contention
- Restart services
- Integrate with version control systems
Manual updates and deployment
[Diagram: five humans, each updating one of five Application Servers.]
Why not: slow deployment, non-atomic/difficult rollbacks
Shared storage
[Diagram: five Application Servers sharing one NFS server.]
Why not: single point of contention and failure
rsync [Diagram: five application servers synchronized with rsync.] Why not: non-atomic, does not manage services.
Capistrano [Diagram: five application servers deployed with Capistrano.] Capistrano provides near-atomic deployment, service restarts, automated rollback, test automation, and version control integration (tagged releases).
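Capistrano keeps each deployment in its own directory under `releases` and points a `current` symlink at the active one. The sketch below illustrates why that layout makes switching and rollback cheap. It is not Capistrano's own code, the function name `activate_release` is hypothetical, and it assumes a POSIX filesystem where renaming over an existing symlink is atomic.

```python
import os


def activate_release(deploy_root, release_name):
    """Point <deploy_root>/current at releases/<release_name> in one rename."""
    release_dir = os.path.join(deploy_root, "releases", release_name)
    if not os.path.isdir(release_dir):
        raise FileNotFoundError(release_dir)
    current = os.path.join(deploy_root, "current")
    tmp_link = current + ".tmp"
    # Clear a leftover temporary link from an interrupted run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    # Build the new link beside the old one, then rename it into place.
    os.symlink(release_dir, tmp_link)
    os.replace(tmp_link, current)
    return os.path.realpath(current)
```

Rolling back is the same call with the previous release name, which is why old release directories are kept around.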
Multistage deployment Deployments can be staged: cap staging deploy, then cap production deploy. [Diagram: development integration, staging, and the five production application servers are each deployed with Capistrano.]
But your application isn't the only thing to manage.
Beneath the application [Diagram: cluster-level configuration spans the reverse proxy cache, the database, and five application servers.] Cluster management applies to package management, updates, and software configuration. cfengine and bcfg2 are popular cluster-level system configuration tools.
System configuration management Deploys and updates packages, cluster-wide or selectively. Manages arbitrary text configuration files. Analyzes inconsistent configurations (and converges them). Manages device classes (app. servers, database servers, etc.). Allows confident configuration testing on a staging server.
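Convergence can be pictured as comparing a declared state with the observed state and listing the actions that close the gap. The sketch below is a generic illustration only. It does not reflect how cfengine or bcfg2 are implemented, and the function name `plan_convergence`, the action labels, and the package versions are hypothetical.

```python
def plan_convergence(desired, actual):
    """List the actions that bring a server's packages to the declared versions."""
    actions = []
    for package, version in sorted(desired.items()):
        installed = actual.get(package)
        if installed is None:
            actions.append(("install", package, version))
        elif installed != version:
            actions.append(("update", package, version))
    # Anything installed but not declared is reported rather than removed.
    for package in sorted(set(actual) - set(desired)):
        actions.append(("report_unmanaged", package, actual[package]))
    return actions


# Hypothetical declared state for the "app server" device class.
app_server_class = {"httpd": "2.2.3", "php": "5.2.10"}
# Hypothetical observed state on one app server.
observed = {"httpd": "2.2.3", "php": "5.1.6", "telnet-server": "0.17"}
```

Running the same plan against a staging server first shows what would change before production is touched.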
All on the management box [Diagram: the management box hosts development integration, staging, deployment tools, and monitoring.]
Monitoring
Types of monitoring Failure: analyzing downtime, viewing failover, troubleshooting, notification. Capacity/Load: analyzing trends, predicting load, checking results of configuration and software changes.
Everyone needs both.
What to use Failure/Uptime: Nagios, Hyperic. Capacity/Load: Cacti, Munin.
Nagios Highly recommended. Used by Four Kitchens and Tag1 Consulting for client work, Drupal.org, Wikipedia, etc. Easy to install on CentOS 5 using EPEL packages. Easy to install nrpe agents to monitor diverse services. Can notify administrators on failure. We use this on Drupal.org
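Nagios checks are small programs that report status through their exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN, with a one-line message on standard output. The sketch below shows that convention for a load check. The function name `check_load`, the message wording, and the thresholds are hypothetical.

```python
# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_load(load, warn, crit):
    """Map a load average onto a Nagios exit code and a one-line message."""
    if load is None or warn > crit:
        return UNKNOWN, "LOAD UNKNOWN - no reading or invalid thresholds"
    if load >= crit:
        return CRITICAL, "LOAD CRITICAL - load average %.2f" % load
    if load >= warn:
        return WARNING, "LOAD WARNING - load average %.2f" % load
    return OK, "LOAD OK - load average %.2f" % load
```

A script run through an nrpe agent would print the message and exit with the code, for example by feeding `os.getloadavg()[0]` into `check_load` and passing the result to `sys.exit`.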
Cacti Highly annoying to set up. One instance generally collects all statistics. (No agents on the systems being monitored.) Provides flexible graphs that can be customized on demand.
Munin Fairly easy to set up. One instance generally collects all statistics. (No agents on the systems being monitored.) Provides static graphs that cannot be customized.
Pressflow Make Drupal sites scale by upgrading core with a compatible, powerful replacement.
Common large-site issues Drupal core requires patching to effectively support the advanced scalability techniques discussed here. Patches often conflict and have to be reapplied with each Drupal upgrade. The original patches are often unmaintained. Sites stagnate, running old, insecure versions of Drupal core because updating is too difficult.
What is Pressflow? Pressflow is a derivative of Drupal core that integrates the most popular performance and scalability enhancements. Pressflow is completely compatible with existing Drupal 5 and 6 modules, both standard and custom. Pressflow installs as a drop-in replacement for standard Drupal. Pressflow is free as long as the matching version of Drupal is also supported by the community.
What are the enhancements? Reverse proxy support. Database replication support. Lower database and session management load. More efficient queries. Testing and optimization by Four Kitchens with standard high-performance software and hardware configuration. Industry-leading scalability support by Four Kitchens and Tag1 Consulting.
Four Kitchens + Tag1 Provide the development, support, scalability, and performance services behind Pressflow. Comprise most members of the Drupal.org infrastructure team. Have the most experience scaling Drupal sites of all sizes and all types.
Ready to scale? Learn more about Pressflow: pick up pamphlets in the lobby, or request Pressflow releases at fourkitchens.com. Get the help you need to make it happen: talk to me (David) or Todd here at DrupalCamp, or email shout@fourkitchens.com.