Quantifying Trends in Server Power Usage
Richard Gimarc, CA Technologies
Richard.Gimarc@ca.com
October 13, 2015
© 2015 CA Technologies. All rights reserved.

What are we going to talk about?
Are today's servers making better use of the power they consume?
Why is this an important question?
  Power consumption is the primary reason companies have to build or upgrade their data centers (very expensive)
  Key data center resources: power, space & cooling
  Power demand & usage is the #1 concern
  IT capacity planners need to have a better understanding of server power usage
How are we going to address the question?
  Use the SPECpower benchmark as our data source
  What does the benchmark tell us about server power usage?
  Can we learn anything from 8 years of benchmark results (2007-2015)?
  Are there trends that indicate an improvement in greenness?

Agenda
SPECpower - benchmark overview
  o Configuration
  o Workload
  o Scope
  o Publication history & profile
SPECpower - sample results
What can we learn from SPECpower results?
  o Are load level & power usage related?
  o Has productivity improved?
  o How has the range of server-level power use changed?
  o Do we see a change in processing efficiency?
  o Can you estimate electricity cost?
Summary
Next Steps

SPECpower Benchmark Introduction [1 of 7]
SPECpower_ssj2008
  Initially released in Dec 2007
  First industry-standard SPEC benchmark that evaluates the power and performance characteristics of volume-class servers
Benchmark goal
  Use a single benchmark for both pure-performance and power-efficiency testing
SPECpower workload
  Server side Java workload
  Scalable & multi-threaded
  Portable across a wide range of operating environments
  Minimal cost to run (time & money)
Primary reported metrics
  Productivity - transactions per power consumed (overall ssj_ops/watt)
  Throughput - power consumption for servers at different load levels
    o Active Idle (0%), 10%, 20%, ..., 100% - graduated workload levels

SPECpower Benchmark Configuration [2 of 7]
SUT (System Under Test)
  System running the benchmark workload
Power Analyzer
  Measures amount of power consumed by the SUT
CCS (Control and Collection System)
  Drives benchmark workload
  Collects measurements
Reference: [SPEC212A]

SPECpower Benchmark Workload [3 of 7]
Workload: 6 different transaction types
  New Order (30.3%) - a new order is inserted into the system
  Payment (30.3%) - record a customer payment
  Order Status (3.0%) - request the status of an existing order
  Delivery (3.0%) - process orders for delivery
  Stock Level (3.0%) - find recently ordered items with low stock levels
  Customer Report (30.3%) - create a report of recent activity for a customer
Server resources
  CPU intensive application
  Memory sufficient for JVM(s)
  Very little network I/O
  Does not write to disk as part of the measured workload
  Transactions modify in-memory data structures representing Warehouses, Customers, etc.
Reference: [SPEC212B] [SPEC212C]

SPECpower Benchmark Graduated Workload [4 of 7]
Calibrate workload for maximum throughput (100%), then scale down
Report results at graduated load levels: 100%, 90%, ..., 10%, 0% (ssj_ops)
Load level ≈ CPU utilization
Reference: [SPEC212A]

SPECpower Benchmark Scope [5 of 7]
Intended to test the following
  CPU, caches, memory, system architecture, JVM, some OS components
  Designed with a small I/O component (network & storage)
What are the consequences?
  The benchmark can be scaled easily in terms of processor sockets
  Eliminates the expense of massive storage arrays
  Large number of results due to ease of benchmark evaluation
  Standardized test to measure power consumption at various performance levels
  Defines a methodology that other benchmarks can utilize to give them a consistent set of power consumption metrics
  Care must be taken when interpreting benchmark results due to limited workload scope

SPECpower Benchmark Publication History [6 of 7]
[Chart: Publication Count by quarter, 2007-Q4 through 2015-Q3; series: Count per Quarter, Cumulative Count]
First result published in 2007-Q4
Total of 484 results through 2015-Q3
Fairly steady rate of results through 2012 (red line)
8 years of results to examine

SPECpower Benchmark Publication Profile [7 of 7]
[Chart: Hardware Vendor Distribution (Percent of Total): Acer Incorporated, Dell Inc., Fujitsu, IBM Corporation, Hewlett-Packard, Huawei Technologies Co. Ltd, Hitachi Ltd., Supermicro Computer Inc., NEC Corporation (Others < 2%)]
[Chart: Core Count Distribution: 2, 4, 6, 8, 12, 16, 20, 24, 32, 36, 40, 48, 64, 72 cores]
[Chart: OS Family Distribution: Red Hat Linux, Mac OS, SUSE Linux, Windows]
The breadth and depth of the industry are not represented
Primarily focused on midrange Windows servers

Sample - SPECpower Results [1 of 5]
Benchmark Sponsor
  Dell Inc. PowerEdge C6105 (AMD Opteron 4376HE, 2.6 GHz)
  Publication: Jan 9, 2013
Benchmark Results Review
  Benchmark Results Summary
  System Under Test (SUT)
  Boot Firmware Settings
  System Under Test Notes
  Controller
  Measurement Devices
  Environmental Data

Sample - SPECpower Results: Benchmark Results Summary [2 of 5]
Performance & power usage at graduated load levels
This is going to be our prime source of data for analysis

Sample - SPECpower Results: System Under Test (SUT) [3 of 5]
Details about the server(s) used in the benchmark

Sample - SPECpower Results: Boot, Controller, & Measurement Devices [4 of 5]
Server boot settings & supporting benchmark devices

Sample - SPECpower Results: Temperature Calibration [5 of 5]
Temperature data & calibration details

What can we learn from SPECpower results?
What's the relationship between load level and power usage?
Has productivity changed over the last 8 years?
How has the range of server-level power changed?
Do you get better transaction processing efficiency at higher throughput?
Can you estimate a server's electricity cost over its lifetime?

Are Load Level & Power Usage related? Introduction [1 of 5]
Question
  Is power usage related to load level and throughput?
Background
  Does power usage increase linearly with load level and throughput?
  Can we use linear interpolation to predict power usage for an arbitrary load level?
  Select 4 results over the past 8 years and determine the relationship between load level, throughput and power usage

Are Load Level & Power Usage related? Samples from 2007 & 2010: High Correlation [2 of 5]
[Charts: Dec 2007 and Dec 2010; Power (W) and Tput (ssj_ops) vs. load level, 0% to 100%; both panels: C(tput,power) = 0.99, C(tput,loadlevel) = 0.99]
Plot throughput & power usage for all load levels (0%, 10%, ..., 100%)
High correlation implies that there is a linear relationship between throughput and power usage
Straight throughput line implies high correlation with load level

Are Load Level & Power Usage related? Samples from 2012 & 2015: High Correlation [3 of 5]
[Charts: Dec 2012 and Sept 2015; Power (W) and Tput (ssj_ops) vs. load level, 0% to 100%; both panels: C(tput,power) = 0.99, C(tput,loadlevel) = 0.99]
Plot throughput & power usage for all load levels (0%, 10%, ..., 100%)
High correlation implies that there is a linear relationship between throughput and power usage
Straight throughput line implies high correlation with load level

Are Load Level & Power Usage related? Results [4 of 5]
Question
  Is power usage related to load level and throughput?
Answer
  Yes - high correlation between load level, throughput & power usage
Can we use linear interpolation to predict power usage for an arbitrary load level?
  System starts at Active Idle (OS booted and stabilized, minimal utilization)
  As workload volume increases, throughput and power usage increase in a linear manner
  Hardware vendors have been working to reduce power consumed at Active Idle
[Chart: Dec 2007; Power (W) and Tput (ssj_ops) vs. load level, 0% to 100%]

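The correlation claim can be checked with a few lines of Python. This is a minimal sketch, not the author's analysis: it assumes the deck's C(...) values are Pearson correlation coefficients, and it correlates load level with power (rather than throughput with power) because per-level throughput figures are not given in the text. The watt readings are the Sep 2015 and Dec 2007 values tabulated on the Active Idle slides; the function name `pearson` is illustrative.

```python
from math import sqrt

# Load levels (percent) and watts per level, from the deck's
# Sep 2015 / Dec 2007 table on the Active Idle slides.
LOAD_LEVELS = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
WATTS_SEP_2015 = [45, 83, 97, 110, 125, 141, 160, 183, 208, 237, 266]
WATTS_DEC_2007 = [187, 197, 207, 215, 224, 231, 238, 245, 250, 256, 260]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    for label, watts in (("Sep 2015", WATTS_SEP_2015),
                         ("Dec 2007", WATTS_DEC_2007)):
        print(f"{label}: C(loadlevel,power) = {pearson(LOAD_LEVELS, watts):.2f}")
```

Both series come out close to 1, which is what a near-linear relationship between load level and power usage would produce.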
Are Load Level & Power Usage related? Estimating power usage at any given load [5 of 5]
[Chart: Average Active Power (W) and Estimated Power (W) vs. load level, 0% to 100%]
How can you estimate power usage for a given load level?
  Assume you are given two points:
    o Max Power (MaxW)
    o Active Idle (IdleW)
  Create the line through those two points
  Use the line to estimate power usage
  Assumption: Target Load is a close approximation to server CPU Capacity Utilization
\[ \text{Power@Util\%} = (\text{MaxW} - \text{IdleW}) \times \frac{\text{Util\%}}{100} + \text{IdleW} \]
\[ \text{Power@70\%} = (573 - 106) \times \frac{70}{100} + 106 \approx 433\ \text{W} \]

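The two-point interpolation above can be written as a short function. This is a minimal sketch under the slide's assumption that load level approximates CPU utilization. The name `estimate_power` is illustrative, and the inputs (IdleW = 106 W, MaxW = 573 W, 70% utilization) are the values of the slide's worked example.

```python
def estimate_power(idle_w, max_w, util_pct):
    """Estimate power draw (W) at a utilization percentage by linear
    interpolation between Active Idle power and Max Power."""
    if not 0 <= util_pct <= 100:
        raise ValueError("util_pct must be between 0 and 100")
    return (max_w - idle_w) * util_pct / 100 + idle_w


if __name__ == "__main__":
    # Worked example from the slide: IdleW = 106, MaxW = 573, Util = 70%
    print(round(estimate_power(106, 573, 70)))  # 433
```

The function returns IdleW at 0% and MaxW at 100%, so the estimate is only as good as the linearity of the measured curve between those two endpoints.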
Productivity Introduction [1 of 3]
Question
  Has SPECpower's productivity metric changed over the last 8 years?
Background
  Primary benchmark result is overall Σssj_ops / Σpower
  This is a productivity metric that relates the amount of work performed to the number of watts consumed

Productivity Results [2 of 3]
[Chart: Server Side Java Operations per Watt (Σssj_ops / Σpower), 2007-12 through 2015-12, y-axis 0 to 12,000; Linear Trend Line, R² = 0.76]
Productivity has improved over the lifetime of the benchmark
We can definitely do more work today with less power

Productivity Results [3 of 3]
[Chart: Server Side Java Operations per Watt (Σssj_ops / Σpower), 2007-12 through 2015-12, y-axis 0 to 12,000; Polynomial Trend Line, R² = 0.81]
Productivity has improved over the lifetime of the benchmark
2007 started at approximately 1,000; we are now 10x at 10,000 in 2015
We can definitely do more work today with less power

Active Idle Introduction [1 of 4]

Load Level   Watts Sep 2015   Watts Dec 2007
0%           45               187
10%          83               197
20%          97               207
30%          110              215
40%          125              224
50%          141              231
60%          160              238
70%          183              245
80%          208              250
90%          237              256
100%         266              260

Question
  Has Active Idle increased or decreased over the past 8 years?
Why is Active Idle important?
  Active Idle is a SPEC defined state
  Application is running, but no transactions are being processed
  System is ready to quickly respond to any incoming transactions
  Our data centers may have a large number of servers that are wasting power while doing very little work ($)
What would we like to see?
  Minimal power consumption when a server is idle
  Decrease the ratio of Active Idle to Max Power
    - Dec 2007: 72%
    - Sep 2015: 17%
\[ \text{Ratio} = \frac{\text{Active Idle}}{\text{Max Power}} \times 100 \]

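The two ratios quoted on this slide follow directly from the first and last rows of the table. A minimal sketch, with an illustrative function name, taking Active Idle as the 0% row and Max Power as the 100% row:

```python
def idle_to_max_pct(active_idle_w, max_power_w):
    """Active Idle power as a percentage of Max Power."""
    return active_idle_w / max_power_w * 100


if __name__ == "__main__":
    # (Active Idle W, Max Power W) from the 0% and 100% rows of the table
    results = {"Dec 2007": (187, 260), "Sep 2015": (45, 266)}
    for label, (idle_w, max_w) in results.items():
        print(f"{label}: {idle_to_max_pct(idle_w, max_w):.0f}%")
```

This reproduces the slide's figures: about 72% for Dec 2007 and about 17% for Sep 2015.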
Active Idle vs. Maximum Power: Why is this an important ratio? [2 of 4]
[Table: Watts by load level (0% to 100%), Sep 2015 vs. Dec 2007, repeated from the Active Idle introduction slide]
Question
  Has the ratio of Active Idle to Max Power changed?
SPECpower reports results at 2 extremes
  HIGH - System running at maximum throughput
  LOW - Throughput is zero (Active Idle)
What would we like to see?
  We want Active Idle to be a small percent of Max Power
  Decrease the ratio of Active Idle to Max Power
  This means that our idle servers are using less power relative to the maximum power a server could use

Active Idle vs. Maximum Power Results: Trend from 2007 through 2015 [3 of 4]
[Chart: Ratio of Active Idle to Max Power, (Active Idle / Max Power) × 100, 2007-12 through 2015-12, y-axis 0% to 100%]
Noticeable improvement in Active Idle (smaller percent of Max Power)
Early results were in the 65% range
Recent results are in the 20% range

Active Idle vs. Maximum Power: What have we learned? [4 of 4]
[Table: Watts by load level (0% to 100%), Sep 2015 vs. Dec 2007, repeated from the Active Idle introduction slide]
Question
  Has the ratio of Active Idle to Max Power changed?
What have we learned?
  The trend we see is Active Idle being a smaller percentage of maximum power consumption
  Less power consumed when a server is idle
  Steeper increase in power consumption as the server's load level increases
  Today's Active Idle is approximately 20% of Max Power

Processing Efficiency Introduction [1 of 4]
Question
  Is processing efficiency related to load level?
Background
  At each load level SPECpower reports the ratio
\[ \text{Performance to Power Ratio} = \frac{\text{ssj\_ops}}{\text{Average Active Power (W)}} \]
  What is the relationship between load level and Performance to Power Ratio?
  Can we process more transactions per Watt at high (or low) load levels?
  Examine 4 results from the past 8 years

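The per-level ratio can be sketched as follows. The watt readings are the Sep 2015 column of the deck's load-level table; the maximum throughput of 1,000,000 ssj_ops is a hypothetical placeholder (the text does not give per-level throughput), and throughput is assumed to scale linearly with load level, as the correlation slides suggest. The function name is illustrative.

```python
# Load levels (percent) and watts from the deck's Sep 2015 table
# (Active Idle is excluded because its throughput is zero).
LOAD_LEVELS = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
WATTS_SEP_2015 = [83, 97, 110, 125, 141, 160, 183, 208, 237, 266]

# Hypothetical value for illustration only; not taken from the source.
HYPOTHETICAL_MAX_SSJ_OPS = 1_000_000


def performance_to_power(max_ssj_ops, load_levels, watts):
    """Return {load level: ssj_ops per watt}, assuming throughput is
    proportional to load level."""
    ratios = {}
    for level, avg_watts in zip(load_levels, watts):
        ssj_ops = max_ssj_ops * level / 100
        ratios[level] = ssj_ops / avg_watts
    return ratios


if __name__ == "__main__":
    ratios = performance_to_power(
        HYPOTHETICAL_MAX_SSJ_OPS, LOAD_LEVELS, WATTS_SEP_2015)
    for level, ratio in ratios.items():
        print(f"{level:>3}%  {ratio:8.0f} ssj_ops/W")
```

Under these assumptions the ratio climbs with load and levels off around the 80% level. The absolute values depend entirely on the placeholder throughput and should not be read as benchmark results.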
Processing Efficiency: Samples from 2007 & 2010 [2 of 4]
[Charts: Performance to Power Ratio vs. load level (10% to 100%); Dec 2007, y-axis 0 to 1,200; Dec 2010, y-axis 0 to 3,000]
As load increases, efficiency rises: we are able to process more transactions per Watt of power consumed
2007 results have an almost linear increase in efficiency across the entire range
2010 results are less linear, with high & steady efficiency at 80% and above

Processing Efficiency: Samples from 2012 & 2015 [3 of 4]
[Charts: Performance to Power Ratio vs. load level (10% to 100%); Dec 2012, y-axis 0 to 6,000; Sept 2015, y-axis 0 to 12,000]
2015 result shows a flattening of the curve at high utilization
Once we reach 80%, we appear to have reached a plateau: no increased efficiency at higher load levels

Processing Efficiency Results [4 of 4]
Is processing efficiency related to load level?
  Yes
  As the load level increases, you are able to process more transactions per Watt consumed
What does this mean?
  If you have a well-defined and controlled workload like SPECpower, then you want to run your servers as hot as possible ... or do you?
  This observation may not apply to servers processing varying workloads (which is the case for most of today's servers)
[Chart: Performance to Power Ratio, Sept 2015, vs. load level (10% to 100%), y-axis 0 to 12,000]

Estimate Server Electricity Cost Introduction [1 of 2]

Load Level   Watts Sep 2015   Watts Dec 2007
0%           45               187
10%          83               197
20%          97               207
30%          110              215
40%          125              224
50%          141              231
60%          160              238
70%          183              245
80%          208              250
90%          237              256
100%         266              260

Question
  Can you use SPECpower results to estimate the cost of running a server?
Goal
  Estimate electricity cost for a server
  Contrast cost for old/new servers
Assumptions
  Limit scope to server
  Server lifetime is 3 years
  Average electricity cost per kWh is $0.10
  PUE of 1.7 (estimated for 2014)
  Load level is approximately equal to CPU utilization
Method
\[ \text{Cost} = \text{watts} \times \frac{24\ \text{hours}}{\text{day}} \times \frac{365\ \text{days}}{\text{year}} \times 3\ \text{years} \times \frac{1\ \text{kilohour}}{1{,}000\ \text{hours}} \times \frac{\$0.10}{\text{kWh}} \times \text{PUE} \]

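The method translates directly into code. This is a minimal sketch using the slide's stated assumptions (3-year lifetime, $0.10 per kWh, PUE of 1.7) as default parameters. The function and parameter names are illustrative, and the watt inputs are the Active Idle and Max Power rows of the table above.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def lifetime_electricity_cost(watts, years=3, dollars_per_kwh=0.10, pue=1.7):
    """Estimated electricity cost (USD) of a server drawing a constant
    `watts` for `years`, scaled by the data center's PUE."""
    hours = HOURS_PER_DAY * DAYS_PER_YEAR * years
    kwh = watts * hours / 1000  # watt-hours to kilowatt-hours
    return kwh * dollars_per_kwh * pue


if __name__ == "__main__":
    # (label, Active Idle W, Max Power W) from the table above
    for label, idle_w, max_w in (("Dec 2007", 187, 260),
                                 ("Sep 2015", 45, 266)):
        low = lifetime_electricity_cost(idle_w)
        high = lifetime_electricity_cost(max_w)
        print(f"{label}: ${low:,.0f} at Active Idle, ${high:,.0f} at 100% load")
```

A server that spends its life at a constant intermediate load falls between the two figures; combining this function with a per-load-level power estimate gives cost as a function of average utilization.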
Estimate Server Electricity Cost Results [2 of 2]
[Charts: Electricity Cost Over Lifetime vs. Average CPU Utilization, y-axis $0 to $1,200; Publication Date: Dec 2007 and Publication Date: Sep 2015]
Power cost may equal server cost over lifetime (3 years)
Older servers used significantly more power at low utilization levels
Larger range for new servers due to decrease in Active Idle

SPECpower Summary: Questions & Answers [1 of 2]
Is power usage related to load level?
  Yes, there is a linear relationship (correlation 0.99)
  Power usage increases linearly with load level
  Use load level to estimate power usage
  [Chart: Sept 2015; Power (W) and Tput (ssj_ops) vs. load level, 0% to 100%]
Has productivity changed over the last 5 years?
  Yes, productivity has improved
  We can do more work today with less power
  Increase from approximately 500 in 2007 to approximately 4,500 in 2013
  [Chart: Server Side Java Operations per Watt (ssj_ops/watt), 2007-12 through 2015-12, y-axis 0 to 12,000]
Are today's servers more energy efficient at low load levels?
  Yes, Active Idle is a smaller fraction of maximum power
  In 2007 Active Idle was about 65% of Max Power; today it is closer to 20%
  Less power consumed when a server is lightly loaded
  [Chart: Ratio of Active Idle to Max Power, 2007-12 through 2015-12, y-axis 0% to 100%]

SPECpower Summary: Questions & Answers [2 of 2]
Is processing efficiency related to load level?
  Yes, as the load level increases, you are able to process more transactions per Watt consumed
  Newer results seem to have a plateau around 80%
  This seems to imply that you want to run your servers as hot as possible, but the increased efficiency is limited to the benchmark workload
  [Chart: Performance to Power Ratio, Sept 2015, vs. load level (10% to 100%), y-axis 0 to 12,000]
Can you use benchmark results to estimate the electricity cost for a server?
  Yes, you can estimate cost for a server at various load levels
  Take advantage of the linear relationship between load level and power consumption
  Power usage is mostly due to CPU usage
\[ \text{Cost} = \text{watts} \times \frac{24\ \text{hours}}{\text{day}} \times \frac{365\ \text{days}}{\text{year}} \times 3\ \text{years} \times \frac{1\ \text{kilohour}}{1{,}000\ \text{hours}} \times \frac{\$0.10}{\text{kWh}} \times \text{PUE} \]

Quantifying Trends in Server Power Usage Summary
Positive results from SPECpower
  Results are reported at graduated load levels
    o A number of benchmarks only report maximum throughput
    o SPECpower gives you an idea of performance over a range
  Productivity reporting
    o ssj_ops/watt
    o Productivity metrics are the building blocks of successful capacity planning
Is green making a difference?
  Yes
  Trends in SPECpower results indicate that today's servers are becoming more energy efficient
    o Productivity improvements (process more work per Watt)
    o Decrease in power usage at low utilization levels

Next Steps
Ideas for extending our analysis
  Are there other trends that are visible if we partition our analysis by server architecture?
  Will cluster analysis show us sets of results that demonstrate similar behavior?
  Can we see any new relationships if we augment SPECpower results with an estimate of server capacity?

References
[MILL211] Rich Miller, "Uptime Institute: The Average PUE is 1.8", 2011, http://www.datacenterknowledge.com/archives/211/5/1/uptime-institute-theaverage-pue-is-1-8/
[SPEC212A] SPECpower_ssj2008 - Design Document - Control and Collection System, 2012, www.spec.org
[SPEC212B] SPECpower_ssj2008 - User Guide, 2012, www.spec.org
[SPEC212C] SPECpower_ssj2008 - Design Document - SSJ Workload, 2012, www.spec.org
[SPEL28] Amy Spellmann, Richard Gimarc and Charles Gimarc, "Green Capacity Planning: Theory and Practice", CMG 2008 International Conference
[TGG27] The Green Grid, "The Green Grid Data Center Power Efficiency Metrics: PUE and DCiE", 2007, www.thegreengrid.org
[UPTM214] Uptime Institute, "2014 Data Center Industry Survey", https://journal.uptimeinstitute.com/214-data-center-industry-survey/