Hot vs Cold: Energy-Efficient Data Centers
SVLG Data Center Efficiency Summit
KC Mares, November 2014
The Great Debate about Hardware Inlet Temperature
- Feb 2003: RMI report on high-performance data centers recommended a 70-90F inlet temp range
- Feb 2010: DOE Best Practices Guide for Energy-Efficient Data Center Design recommended 80.6F
- June 2011: Dell report put the temperature sweet spot at 75-80F
- Typical server fans consume 8W at low speed and 16W at high speed
- Low speed at 78F or below; high speed at 95F (35C)
- Intel suggests that most air-cooled platforms are designed to work at 35C inlet
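A minimal sketch of what those figures imply for per-server fan power, using the 8W / 16W and 78F / 95F numbers above; the linear ramp between the two thresholds is an assumption (real fan curves are stepped or roughly cubic in RPM):

```python
def server_fan_power_w(inlet_f, low_w=8.0, high_w=16.0,
                       low_thresh_f=78.0, high_thresh_f=95.0):
    """Estimate per-server fan power from inlet temperature.

    Uses the slide's figures: ~8 W at low speed (<= 78 F) and ~16 W at
    high speed (>= 95 F / 35 C). The linear ramp in between is an
    assumed interpolation, not a measured fan curve.
    """
    if inlet_f <= low_thresh_f:
        return low_w
    if inlet_f >= high_thresh_f:
        return high_w
    frac = (inlet_f - low_thresh_f) / (high_thresh_f - low_thresh_f)
    return low_w + frac * (high_w - low_w)

for t in (70, 78, 85, 90, 95):
    print(f"{t} F inlet -> ~{server_fan_power_w(t):.1f} W per server")
```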
Concerns with Higher Temperature
- Equipment failure rate increases with inlet temperature
- HDDs and fans have the highest failure rates due to higher temps
- Degraded disk drive throughput due to vibration from fans
- IT equipment fans speed up and get noisy
- Redundancy of drives, and SSDs, can reduce these risks
Higher Densities Force Higher Inlet Temps
[Chart: power trends in watts/ft2 of equipment (ASHRAE)]
Benefits of Higher Supply Temperature
- Higher delta-T = higher cooling system efficiency and capacity
- Reduced cooling system CapEx
- Reduced cooling system energy use can lower utility usage and demand charges
- Higher power densities reduce cooling redundancy and will force higher supply air temps with lower-density cooling systems, so get used to it!
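The capacity claim follows from the sensible-heat pickup of air, Q = m_dot * c_p * dT: at a fixed heat load, doubling the delta-T halves the required airflow, and fan power scales roughly with the cube of flow (fan affinity laws). A small illustration; the 1,000 kW load is arbitrary:

```python
# Sensible heat: Q [kW] = rho [kg/m^3] * c_p [kJ/(kg*K)] * flow [m^3/s] * dT [K]
RHO_AIR = 1.2    # kg/m^3, approximate at sea level
CP_AIR = 1.006   # kJ/(kg*K)

def airflow_m3s(load_kw, delta_t_k):
    """Airflow needed to carry a heat load at a given air delta-T."""
    return load_kw / (RHO_AIR * CP_AIR * delta_t_k)

load_kw = 1000.0                # illustrative IT heat load
for dt in (6, 12):              # ~11 F vs ~22 F delta-T
    print(f"dT = {dt} K -> {airflow_m3s(load_kw, dt):,.0f} m^3/s of air")

# Halving the flow cuts fan power to roughly (1/2)^3 of its original value.
print(f"half the flow -> ~{0.5 ** 3:.1%} of the fan power")
```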
It's All Energy, so the Focus Is on Optimization
- Each 1F rise in supply air temp reduces HVAC energy by 2-3%
- Server fans = 1-4% of server load, and up to 15% at high speed
- Typical mechanical loads are 25% of total data center electrical load
- Server fan speeds typically ramp up at 78F
- If server leakage current rises above 38C/100F, hold the inlet below that; with 3-5C of heat-exchange approach, that still allows a 90-95F inlet supply temp (see the sketch below)
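A toy model of the balance those rules of thumb describe, treating mechanical load as ~25% of total at a 68F baseline, HVAC energy falling 2.5% per F of warmer supply, and server fans ramping from 1% to 15% of server load between 78F and 95F; the ramp shape and the baseline point are assumptions:

```python
def total_power_mw(supply_f, it_mw=10.0):
    """Toy total facility power vs supply-air temperature."""
    hvac_base = it_mw * (0.25 / 0.75)   # mechanical ~25% of total at baseline
    hvac = hvac_base * (1 - 0.025) ** (supply_f - 68)   # ~2.5%/F savings
    # Server fans: ~1% of server load below 78 F, ramping to ~15% by 95 F.
    if supply_f <= 78:
        fan_frac = 0.01
    else:
        fan_frac = min(0.15, 0.01 + 0.14 * (supply_f - 78) / (95 - 78))
    return it_mw * (1 + fan_frac) + hvac

best = min(range(68, 96), key=total_power_mw)
for t in (68, 78, 85, 95):
    print(f"{t} F supply -> {total_power_mw(t):.2f} MW total")
print(f"Lowest total power near {best} F in this toy model")
```

Even this crude model lands the optimum right at the fan-speed threshold, which is the operating point recommended in the Solutions slides below.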
Additional Ways to Efficiency in Google Data Centers
- Economization: water towers, outside air, and water sources for cooling
- Servers use the minimum amount of fan power
- Components designed to operate efficiently from idle to full capacity or usage
- Result: data center fleet average PUE reduced from ~1.22 (2008) to 1.12 (Q2 2014)
Facebook Data Centers
- Initial operating ranges: 18C to 27C and 65% max RH
- Revised maximum operating ranges: 29C and 90% RH (depending upon location)
- Current design standard is to provide 85F supply for people comfort, even though servers can handle 95F
- Result: Prineville full-load PUE of 1.07
Dell Study on Composite Server Power to Temperature [Chart]

Dell Comparison of IT & Cooling Power [Chart]

Total Power with IT Equipment circa 2007 [Chart]
Hardware Continues to Increase Inlet Temp
- The ideal range was 75-79F with circa-2007 IT equipment
- Improved server thermal design appears to have shifted the sweet spot to just above 80F (27C)
- For economized cooling systems, the ideal temp would fall somewhere above 100F (38C)
- There really is no sweet-spot temperature while economizing; just follow the ambient up to the temperature you are comfortable with
Why Add a Compressor Cooling System?
- At 27C, the majority of data center locations would be economizing 90% of the time, so why add a cooling system for only 10% of the time?
- The trade-off is $100/MW of IT load in energy vs $1,500/MW in cooling-system CapEx (see the sketch below)
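A back-of-the-envelope version of that trade-off: owning a chiller plant for the ~10% of hours you cannot economize, versus simply letting server fans run harder during those hours. The fan penalty and per-MW chiller cost below are assumptions for illustration (the chiller cost is roughly in line with the $4.5M chiller-plus-generator figure in the table that follows):

```python
it_mw = 5.0                   # illustrative IT load
rate_per_kwh = 0.05           # $/kWh, same rate as the CapEx examples below
hot_frac = 0.10               # fraction of the year above the economizer limit
fan_penalty = 0.10            # extra server fan power while running hot (assumed)
chiller_capex_per_mw = 1.5e6  # $ per MW of IT load (assumed)

extra_kwh = it_mw * 1000 * fan_penalty * 8760 * hot_frac
energy_cost_per_yr = extra_kwh * rate_per_kwh
capex = it_mw * chiller_capex_per_mw

print(f"Running hot 10% of the year: ~${energy_cost_per_yr:,.0f}/yr in extra fan energy")
print(f"Chiller plant to avoid it:   ~${capex:,.0f} up front")
```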
Examples of CapEx Savings

Case             PUE   Supply Air   Supply Water   Energy Costs   Chiller Size   First Cost
                       (C / F)      (C / F)        ($0.05/kWh)    (tons)         (Chiller + Generator)
A: Air Cooled    1.19  20 / 68      7 / 44         $2,350,000     2,250          $4,500,000
B: Air Cooled    1.17  32 / 90      18 / 64        $2,320,000     0              $0
C: Water Cooled  1.15  n/a          8 / 46         $2,270,000     2,250          $4,500,000
D: Water Cooled  1.11  n/a          18 / 64        $2,180,000     0              $0
E: CRAH Units    1.17  20 / 68      7 / 44         $2,530,000     2,250          $4,500,000
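Reading the table: the warmer-supply designs (B and D) lower annual energy cost and eliminate the chiller plant at the same time, so there is no CapEx/OpEx trade-off to agonize over. A trivial arithmetic check against case A, with the figures copied from the table:

```python
cases = {
    "B: air cooled, 90 F supply":  {"energy": 2_320_000, "capex": 0},
    "C: water cooled, 46 F water": {"energy": 2_270_000, "capex": 4_500_000},
    "D: water cooled, 64 F water": {"energy": 2_180_000, "capex": 0},
}
base_energy, base_capex = 2_350_000, 4_500_000  # case A

for name, c in cases.items():
    print(f"{name}: saves ${base_energy - c['energy']:,}/yr in energy "
          f"and ${base_capex - c['capex']:,} in first cost vs case A")
```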
Example of Exhaust Temperature Rise [Chart]
Hardware Supports Higher Temperatures
- Hardware manufacturers balance higher CapEx (larger heat sinks and fans) against higher OpEx (more fan horsepower)
- Most manufacturers have settled at about 77F/25C as the balance point
- Today's hardware is better designed for wide temp ranges and higher supply air temps
- Most manufacturers support 35C or greater
Maximum Temperature Rating Benchmarking Survey Results (ASHRAE) [Chart]
Intel Data Center in a Box
- 900 servers, 10-month test, servers operating at 90% utilization
- 1,000 sq ft trailer in Santa Fe, New Mexico
- Half on standard DX cooling; half on outside air up to 90F
- No humidity control; RH varied from 4% to 90%
- Low-quality residential air filter; servers covered in a layer of dust
- Air-economized half: 74% reduction in cooling energy, on air-side economization 91 percent of the time
- Projected savings: $2.87 million in cooling energy for a 10 MW data center
Microsoft Data Center in a Tent
- Inside the tent, five HP DL585s (already past useful life) ran from Nov 2007 to June 2008
- Results: water dripped onto the rack and the servers; a windstorm blew a section of fence onto the rack; an itinerant leaf was sucked onto the server fascia
- ZERO failures and 100% uptime
Focus on the Correct Goal
- ASHRAE is not a server manufacturer; ASHRAE creates guidelines, not hard rules
- Run your data center to provide for your IT needs at the lowest TCO
- "Above an air inlet temperature of 23C, the speed of the air-moving devices increases to maintain fairly constant component temperatures; in this case, inlet temperature changes have little to no effect on component temperatures, and thereby no effect on reliability, since component temperatures are not affected by ambient temperature changes." (ASHRAE)
Goals
- Minimize OpEx: server load + mechanical load + accelerated hardware replacements due to thermal failures
- Lower CapEx by reducing cooling system costs
- Design to meet conditions with the lowest TCO (OpEx and CapEx)
- Maintain satisfactory acoustic limits
- Maintain satisfactory hardware failure rates
- Reduce disk drive throughput degradation
Optimize Supply Temperatures
- What is the ideal temp range that minimizes TCO (CapEx and OpEx)?
- Balance IT and infrastructure loads for the lowest total power
- Turn off chillers and run on water-side or air-side economization
- Better yet, have no compression-based chilling at all and avoid the CapEx
- Reducing compressor-based cooling can reduce vibration, and thus extend hardware life and disk throughput
Solutions
- Throttle down or turn off hardware when temps are warm; the probability of a redundant data center failing simultaneously with one throttled down is very low
- Use carbon racks instead of metal to reduce vibration
- Run inlet supply temps at or just below the server fan-speed threshold most of the time
- Determine time over threshold, calculate TCO and IT capacity at the higher temps, and design for the optimum (see the sketch below)
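A minimal sketch of the time-over-threshold analysis in the last bullet: given hourly ambient temperatures, count how often supply air (ambient plus an assumed heat-exchanger approach) would exceed the server fan-speed threshold. The weather series here is synthetic; in practice you would load TMY data for the candidate site:

```python
import math
import random

random.seed(1)
FAN_THRESHOLD_F = 78.0   # fan-speed ramp point from the earlier slides
APPROACH_F = 7.0         # assumed heat-exchanger approach (~4 C)

# Synthetic year of hourly dry-bulb temps with seasonal and daily swings.
ambient_f = [
    55 + 20 * math.sin(2 * math.pi * h / 8760)       # seasonal swing
       + 8 * math.sin(2 * math.pi * (h % 24) / 24)   # daily swing
       + random.gauss(0, 3)
    for h in range(8760)
]

hours_over = sum(1 for t in ambient_f if t + APPROACH_F > FAN_THRESHOLD_F)
print(f"Supply would exceed {FAN_THRESHOLD_F:.0f} F for {hours_over} h/yr "
      f"({hours_over / 8760:.0%} of the year)")
```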
Solutions (continued)
- Monitor, evaluate, calculate, optimize, and repeat, continuously and autonomously; ambient conditions, workloads, and IT hardware are all dynamic
- Remove either the server fans or the data center fans; servers have more fan horsepower than needed
- Reduce vibration by locating AC systems outside the IT rooms and using lower-power, lower-fan-speed systems
- Adjust supply air temps to keep fan speeds and cooling systems running at optimum efficiency (a toy control loop follows)
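A toy version of that last bullet's control loop: creep the supply-air setpoint upward for cooling-system efficiency until server fans begin to ramp, then back off. The sensor and actuator functions are placeholders; a real implementation would read fan telemetry (e.g., via IPMI) and write setpoints through the building management system:

```python
def read_avg_fan_speed_pct() -> float:
    """Placeholder for fleet-wide average server fan speed telemetry."""
    return 35.0  # stub value; a real system would poll the servers

def set_supply_setpoint_f(temp_f: float) -> None:
    """Placeholder for a building-management-system write."""
    print(f"supply setpoint -> {temp_f:.1f} F")

FAN_RAMP_PCT = 40.0   # fan speed that signals servers are working harder (assumed)
STEP_F = 0.5          # setpoint adjustment per control interval
setpoint_f = 72.0

for _ in range(5):    # one pass per control interval (e.g., every 15 minutes)
    if read_avg_fan_speed_pct() < FAN_RAMP_PCT:
        setpoint_f = min(setpoint_f + STEP_F, 80.0)  # creep warmer for efficiency
    else:
        setpoint_f = max(setpoint_f - STEP_F, 65.0)  # back off before fans ramp
    set_supply_setpoint_f(setpoint_f)
```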
Thank you
KC Mares
kcmares@megawattconsulting.com
www.megawattconsulting.com
#KCMares