PowerExecutive
IBM

Agenda
- Why PowerExecutive: The Data Center Power/Cooling Crisis
- Fundamentals of PowerExecutive

The Data Center Power/Cooling Crisis
- Customers want more IT processing cycles to run their business
- IT equipment is packed together until the physical power/cooling limits are reached
  - Regardless of whether the technology is more or less power hungry
  - The heat load of future systems is projected to be >40 kilowatts per rack
- For 40+ years, providing power/cooling did not stymie data center growth
  - Today, given the high power/cooling loads of IT equipment, power/cooling is a concern
- The power/thermal problem in Data Centers is growing
  - Most Data Centers are experiencing some sort of power/cooling problem
  - The degree of the problem varies widely; it is worse in older, smaller, and cramped Data Centers
  - At some point, will performance be limited by the power/cooling capabilities of the Data Center?
- Data Center operators claim they are out of power/cooling
  - They cannot accept more IT equipment or cannot fully populate existing racks

Why PowerExecutive
- Historically, IT vendors have provided few tools to help customers understand and evaluate the power/cooling consumption of IT equipment
- PowerExecutive is the first IT-based power management tool
  - PowerExecutive will enable administrators to control, manage and optimize power at the system level
  - Managers can now determine how much power IT equipment is consuming
  - PowerExecutive uses a combination of hardware, firmware, BIOS and systems management software
- Questions your customers might ask
  - Q: Is my data center really out of power and cooling?
  - Q: How much power/cooling is stranded in my data center?
  - Q: Am I converting all of my available power/cooling into compute cycles?

Three Fundamentals of Power Management
1. Measure/Trend Power Consumption
   - All System-X rack and blade servers have a power meter
   - IBM Director provides the ability to trend power/temperatures over long periods of time
2. Cap or Allocate Power Correctly
   - Prior to PowerExecutive: allocation based on the power supply label
   - Power consumed is a function of:
     - PS/regulator efficiencies
     - HW options (CPUs, memory, HDD, etc.)
     - SW (hypervisor, OS, apps) loading
   - Allocate power based on past history using power measurements:
     - to match the need of each server
     - to match the P/T (power/thermal) limits of the data center
3. Reduce Power Consumed
   - CPUs can reduce power in periods of low utilization using DBS and PowerNow
   - Save power costs

Three Fundamentals of Power Management
PowerExecutive will provide:
- A view of power consumption across your IT equipment over hours, weeks, months
  - Using your applications and workloads!
- A way to reduce your power/thermal requirements
  - Consume the available power/cooling before investing in additional infrastructure (e.g. HVAC, UPS, generators)
- A way to reduce power consumption during periods of low utilization (saving power costs, e.g. from the utility company)

PowerExecutive Demo <here>
- The PowerExecutive software acts as a manager in the IP network
  - Connecting to IT equipment
- Runs as either a stand-alone application or an IBM Systems Director plug-in

Measure/Trend Power Consumption
- Power meter in every server (racks and blades)
  - No external equipment needed for new servers
- Continuously reports power (watts) consumption
  - Also translates into the thermal load placed on the Data Center
- Each server is capable of reporting this information to PowerExecutive
  - No guessing how much power is being consumed and when
  - The average Intel architecture server runs 5-10% utilized
  - Virtualization is expected to increase that number (on average)
- System-X blade and rack servers contain this power meter hardware today!
- Trend power consumption
  - How much power did I use between 8am and 6pm M-F? On the weekend?
  - Single server or group of servers, representing a chassis, rack, island or the Data Center itself
  - Trend ambient and exhaust heat index temperatures
  - Export data to a spreadsheet or HTML
- Understanding your power consumption
  - How close are you to the P/T limits of the data center?

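The trending question above ("How much power did I use between 8am and 6pm M-F? On the weekend?") can be sketched against exported readings. This is a minimal illustration, not PowerExecutive's own export format or API: the `(timestamp, watts)` sample layout, the function names, and the sample values are all assumptions. The watts-to-heat factor (1 W is about 3.412 BTU/hr) is the standard unit conversion.

```python
from datetime import datetime, timedelta

# Standard unit conversion: 1 watt of IT load is about 3.412 BTU/hr of heat.
BTU_PER_HOUR_PER_WATT = 3.412


def is_business_hours(ts):
    """True for Monday-Friday, 8am (inclusive) to 6pm (exclusive)."""
    return ts.weekday() < 5 and 8 <= ts.hour < 18


def is_weekend(ts):
    """True for Saturday and Sunday."""
    return ts.weekday() >= 5


def summarize(samples, interval_hours, keep):
    """Summarize the readings whose timestamp satisfies keep(ts).

    samples: list of (datetime, watts) readings taken every interval_hours.
    Returns average watts, energy in kWh, and average heat load in BTU/hr,
    or None if no reading falls in the selected window.
    """
    watts = [w for ts, w in samples if keep(ts)]
    if not watts:
        return None
    avg = sum(watts) / len(watts)
    return {
        "avg_watts": avg,
        "kwh": sum(watts) * interval_hours / 1000.0,
        "avg_btu_per_hour": avg * BTU_PER_HOUR_PER_WATT,
    }


# Hypothetical data: one reading per hour for a week starting on a Monday,
# 300 W during business hours and 200 W otherwise.
start = datetime(2007, 1, 1)
samples = []
for hour in range(7 * 24):
    ts = start + timedelta(hours=hour)
    samples.append((ts, 300.0 if is_business_hours(ts) else 200.0))

business = summarize(samples, 1.0, is_business_hours)
weekend = summarize(samples, 1.0, is_weekend)
```

The same `summarize` call works for a group of servers if the per-server readings are summed per timestamp first.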
Cap and Allocate Power Consumption
- Label power is a poor measure of the maximum power consumption of a server
  - Does not take into account application utilization
  - Label power typically over-provisions AC power and air conditioning to a server
  - Rack servers vary widely in options and power consumption
- PowerExecutive allows the administrator to set a server's maximum power consumption
  - Historical (past) power consumption is a very good indicator of future consumption
  - By using the PowerExecutive power trending graphs you can easily determine the proper power allocation of a server
  - Administrators could set maximums based on how close they want to be to historical maximums
- Point: Allocate power where it is needed
  - Sometimes max power will be needed: fully configured, high-priority systems
  - Often less than max is needed: minimally configured or low-priority systems
- Animation slide later in presentation

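One way to picture "set maximums based on how close they want to be to historical maximums" is a cap equal to the measured peak plus a chosen headroom, never exceeding label power. The sketch below is an assumption for illustration only: the function name, the headroom percentages, and the wattages are hypothetical and do not describe how PowerExecutive computes or applies a cap.

```python
def suggest_power_cap(history_watts, headroom=0.10, label_watts=None):
    """Suggest a power cap from measured history (hypothetical helper).

    history_watts: measured power readings for one server, in watts.
    headroom: fraction added on top of the historical maximum.
    label_watts: optional power supply label rating; the cap never exceeds it.
    """
    if not history_watts:
        raise ValueError("need at least one power measurement")
    if headroom < 0:
        raise ValueError("headroom must not be negative")
    cap = max(history_watts) * (1.0 + headroom)
    if label_watts is not None:
        cap = min(cap, label_watts)
    return cap


# Hypothetical readings for one server with a 670 W power supply label.
history = [210.0, 250.0, 305.0, 280.0]
label = 670.0

# High-priority system: generous headroom above the measured peak.
high_priority_cap = suggest_power_cap(history, headroom=0.25, label_watts=label)

# Low-priority system: cap right at the measured peak.
low_priority_cap = suggest_power_cap(history, headroom=0.0, label_watts=label)

# Power that was allocated by label but can now be allocated elsewhere.
reclaimed_watts = label - high_priority_cap
```

The headroom value is the policy knob: larger for fully configured, high-priority systems, smaller for low-priority ones.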
Reduce Power Consumed
- CPUs in a server consume a great deal of power
  - Slower, lower-frequency CPUs consume less power; faster CPUs consume more power
  - The voltage the CPU runs at is a major factor in power consumption
- CPU P-state controls allow the CPU to run at reduced voltage and frequency levels
  - Saving power
  - Reducing performance
- CPU throttling has secondary effects on other components in the server (e.g. memory, I/O)
- Point: Power is saved when the OS reduces the P-state of the CPU during periods of low activity

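Why voltage matters so much can be shown with the common textbook approximation for CMOS dynamic power, P ≈ C × V² × f. This formula is not taken from the slides, it ignores static (leakage) power, and the voltage and frequency ratios below are hypothetical rather than real P-state tables for any CPU.

```python
def relative_dynamic_power(voltage_ratio, frequency_ratio):
    """Dynamic power relative to the full-speed state.

    Uses the textbook approximation P ~ C * V^2 * f, so the switched
    capacitance C cancels out and only the ratios to the full-speed
    voltage and frequency matter. Leakage power is ignored.
    """
    if voltage_ratio <= 0 or frequency_ratio <= 0:
        raise ValueError("ratios must be positive")
    return voltage_ratio ** 2 * frequency_ratio


# Hypothetical reduced P-state: 90% of full voltage, 80% of full frequency.
reduced = relative_dynamic_power(0.9, 0.8)

# Lowering frequency alone saves less than lowering frequency and voltage.
frequency_only = relative_dynamic_power(1.0, 0.8)
```

Because voltage enters squared, a P-state that lowers both voltage and frequency saves more dynamic power than the drop in frequency (and performance) alone would suggest.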
Simple Example: Set Power Cap to Peak

[Figure: simulated graph of actual power consumed by the server over time. Y-axis: Power (watts). X-axis: Time, PowerExecutive trending (weeks, months).]

Allocation Model of Server (levels shown on the graph)
- Label Power: power typically allocated to a server
  - Today, label power is the only option within the server
- Power Configurator: improvement over label power
  - Planning estimate
  - Based on typical HW power consumption
- Set Power Cap: based on measured power

Regions shown on the graph
- Over Allocated: power not converted into compute cycles
- Wasted Power Allocation: power budget not converted into compute cycles
- Proper Power Allocation: power budget converted into compute cycles

Simple Example: Getting More Out of Your Data Center Using PowerExecutive
- Measure and trend power
- Determine the proper power allocation for each server
- Reallocate power to additional servers without additional power/cooling equipment

[Figure, single server: Power (watts) over Time, PowerExecutive trending (weeks, months), with Label Power marked. Server X power trending indicates over-allocated power (red bar).]

[Figure, rack: upper bound on P/T for the rack.]
- Rack allocation before PowerExecutive: Server 1 through Server 7
- Rack allocation after PowerExecutive: Server 1 through Server 7, plus Server A and Server B
  - +2 additional servers in the same power/cooling envelope

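The rack example can be reproduced with simple arithmetic: whatever the rack's power/thermal upper bound leaves unallocated can host more servers. The numbers below (a 4900 W rack limit, 700 W label power, 500 W measured allocation) are hypothetical values chosen only to mirror the "+2 additional servers" outcome; they are not taken from the slides.

```python
def additional_servers(rack_limit_watts, allocations_watts, new_server_watts):
    """How many more servers fit under the rack's power/thermal upper bound.

    rack_limit_watts: upper bound on power for the rack.
    allocations_watts: power currently allocated to each installed server.
    new_server_watts: power to allocate to each additional server.
    """
    if new_server_watts <= 0:
        raise ValueError("new_server_watts must be positive")
    free = rack_limit_watts - sum(allocations_watts)
    if free <= 0:
        return 0
    return int(free // new_server_watts)


RACK_LIMIT = 4900  # hypothetical upper bound on P/T for the rack, in watts

# Before: seven servers, each allocated its (hypothetical) 700 W label power.
before = additional_servers(RACK_LIMIT, [700] * 7, 700)

# After: the same seven servers allocated by measured history at 500 W each.
after = additional_servers(RACK_LIMIT, [500] * 7, 500)
```

With these assumed values the rack is full when allocated by label power, while allocating by measured power frees room for two more servers in the same envelope.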
PowerExecutive in action!
- Manage power at the rack and server level
- Compare actual vs. nameplate power at the system level
- View inlet and exhaust temperature
- Track heat emitted
- Compare rack actual power vs. label power
- Trend power use over time
- Trend temperature over time