DCIM: Data Center Infrastructure Management
Part 2: Implementation Challenges & Strategic Advantages - Predictive Analysis
August 8, 2013
Presented by Julius Neudorfer
Sponsored by
Today's Topics
- Implementation Challenges!
- Strategic Advantages of DCIM
  - Predictive and proactive maintenance analysis to improve overall availability
  - Capacity planning to improve operational and energy efficiency
Implementation Challenges
- Environmental sensor installation
  - Temperature, humidity, airflow, static pressure
  - Cabled or wireless
- Energy monitoring: level of granularity
  - UPS (power measurement may be built in, but not energy)
  - Power distribution: PDU, rack, IT device
  - Cooling: CRAH, chillers, pumps, fans
Facilities Monitoring Points
Implementation: Retrofit Issues
- Electrical panels
  - Energy monitoring requires adding current sensors (CTs) and voltage sensors (PTs) to electrical panels
  - Live (hot) work is not permitted by OSHA
  - May require shutdown of the panel (and its load!)
- Adding intelligent rack PDUs
  - Typically plug in to existing receptacles
  - Require IT power-down unless the equipment has redundant A-B power supplies
Existing System Integration
- Existing BMS and IT systems may already have a wide range of sensors and can provide some of the data
- Integrating these systems can avoid installing duplicate sensors
- However, while it seems simple at face value, it is sometimes far easier said than done
Communications
- IT equipment: TCP/IP (IT network standards)
  - SNMP: Simple Network Management Protocol
  - IPMI: Intelligent Platform Management Interface (Intel)
- Facilities: Building Management Systems (BMS)
  - BACnet (ASHRAE)
  - Modbus (dedicated serial wiring)
  - LON network: unshielded twisted pair (UTP) cable
  - BACnet/IP (over UDP) and Modbus/TCP (over TCP)
- Interoperability and compatibility issues
  - Vendor-specific and proprietary
Communications
- Wireless
  - Ethernet 802.11 (Wi-Fi: a/b/g/n, etc.)
  - Zigbee mesh network
  - Vendor proprietary
- Dedicated hardwire (BACnet, Modbus)
- Ethernet: separate management network or over the production network
- Vendor interoperability
  - IT: MIBs
  - Facilities: Modbus register mapping
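On the facilities side, vendor interoperability often comes down to register maps: two vendors may expose the same measurement at different Modbus addresses and with different scale factors. The following is a minimal sketch of normalizing raw register values to common point names. The vendor names, register addresses, and scale factors are hypothetical and chosen only for illustration; they do not come from any real product.

```python
# Hypothetical register maps: addresses, point names, and scale factors are
# illustrative assumptions, not taken from any real vendor documentation.
REGISTER_MAPS = {
    "vendor_a_pdu": {
        100: ("power_kw", 0.1),     # raw value in tenths of a kW
        101: ("voltage_v", 0.1),    # raw value in tenths of a volt
    },
    "vendor_b_pdu": {
        3000: ("power_kw", 0.001),  # raw value in watts
        3002: ("voltage_v", 1.0),   # raw value in whole volts
    },
}


def normalize(device_type, raw_registers):
    """Translate {address: raw value} into {point_name: engineering value}."""
    register_map = REGISTER_MAPS[device_type]
    points = {}
    for address, raw in raw_registers.items():
        if address not in register_map:
            continue  # skip registers the map does not describe
        name, scale = register_map[address]
        points[name] = round(raw * scale, 3)
    return points


# Two different devices reporting the same physical quantities.
reading_a = normalize("vendor_a_pdu", {100: 42, 101: 2081})
reading_b = normalize("vendor_b_pdu", {3000: 4200, 3002: 208})
print(reading_a)
print(reading_b)
```

Keeping the map as data rather than code means a new vendor can be supported by adding a table entry, which is roughly the role SNMP MIBs play on the IT side.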
Frequency of Data Gathering
- Real-time (1-60 seconds)
- Frequency: 1, 5, 15 or 60 minutes
- Trending: hourly, daily, weekly, monthly
- Frequency of polling increases database size
- Number of monitoring points (devices, nodes) can impact size, speed and cost of the system
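To see how polling frequency and point count drive database size, a rough estimate can be sketched as below. The figure of 5,000 monitoring points is an assumption for illustration only; storage per sample depends on the DCIM product and is deliberately left out.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(points, interval_seconds):
    """Number of readings stored per day at a fixed polling interval."""
    return points * (SECONDS_PER_DAY // interval_seconds)


# Assumed site size for illustration only.
POINTS = 5_000

for label, interval in [("1 s", 1), ("1 min", 60), ("5 min", 300),
                        ("15 min", 900), ("60 min", 3600)]:
    count = samples_per_day(POINTS, interval)
    print(f"{label:>6}: {count:>13,} samples/day")
```

Moving from 15-minute polling to 1-second polling multiplies the stored samples by 900, which is why real-time data is usually kept only briefly and rolled up into hourly or daily trends.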
Challenges: Multi-Vendor Integration
The ability to accept or poll data from different vendors' equipment, or to monitor existing facilities systems:
- Power/energy
  - UPS
  - Floor-level distribution: PDU/wall panel
  - Branch circuit
- Cooling systems (temperature, ΔT and humidity)
  - CRAC, CRAH, chillers, pumps, cooling towers
- External conditions: temperature and humidity
- Room environmental conditions: cold aisle and hot aisle
- At each rack, via IT systems
Challenges: Integration of IT Equipment
The ability to accept or poll data from different vendors' equipment, or to monitor existing IT systems:
- Power draw (per rack or device): servers, storage, switches
- Utilization: CPU, disk space, bandwidth, etc.
- Cooling: intake temperature and exhaust temperature (ΔT)
Strategic Advantages of DCIM
- Predictive and proactive maintenance analysis to improve overall availability
- Capacity planning to improve operational and energy efficiency
Potential Benefits: Facility
- Power provisioning: branch circuits, receptacle types
- Cooling airflow optimization
  - Match to rack-level heat load
  - Fewer hot spots
  - Less overcooling
  - Improved energy efficiency
- Proactive monitoring and preventive maintenance
  - Detect performance changes
Real-time Monitoring Allows Proactive Maintenance
- Basic facility data
  - Power (kVA/kW: real-time, instantaneous)
  - Energy (kWh: power over time), PUE
  - Temperature and humidity
- Actionable
  - Baseline > alarm set points
  - Trends > early warning
  - Proactive maintenance
  - Resource and energy optimization
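The distinction between power (an instantaneous kW reading) and energy (kWh accumulated over time) matters because PUE is a ratio of energies: total facility energy divided by IT energy. A minimal sketch, assuming evenly spaced 15-minute power readings with made-up values:

```python
def energy_kwh(power_kw_samples, interval_minutes):
    """Approximate energy by summing power samples taken at a fixed interval."""
    hours_per_sample = interval_minutes / 60
    return sum(power_kw_samples) * hours_per_sample


def pue(total_facility_kwh, it_kwh):
    """Power Usage Effectiveness: total facility energy / IT energy."""
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return total_facility_kwh / it_kwh


# Assumed 15-minute readings covering one hour (illustrative values only).
facility_kw = [180.0, 182.0, 178.0, 180.0]
it_kw = [100.0, 100.0, 100.0, 100.0]

facility_energy = energy_kwh(facility_kw, 15)
it_energy = energy_kwh(it_kw, 15)
print(f"Facility: {facility_energy} kWh, IT: {it_energy} kWh, "
      f"PUE: {pue(facility_energy, it_energy):.2f}")
```

A UPS that reports only instantaneous kW can still feed this calculation, provided the DCIM system polls it often enough for the sum to be a fair approximation.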
Electrical Energy Measurement
Improved Availability: Proactive Maintenance
- Early detection of performance degradation reduces the chance of system- or component-level failure
- Preemptive service rather than reactive break-fix or simple scheduled periodic intervals
Baseline and Trends Provide Early Problem Detection
[Chart: monthly values, Jan-Dec, on a 0-300 scale for IT Load, Cooling Energy Total, Chiller, CRAH-Fans, CRAH-Humidity, Pumps-CW, Condenser Fans and Pumps-Condenser; a deviation marked "??????" is the cue for proactive maintenance]
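One simple way to turn a baseline into an early warning is to flag readings that stray too far from the historical mean. The sketch below uses a three-standard-deviation rule, which is an assumption made here for illustration; the presentation does not prescribe a specific detection method, and the monthly figures are invented.

```python
from statistics import mean, stdev


def find_deviations(history, recent, threshold_sigmas=3.0):
    """Return (index, value) pairs in `recent` that stray from the baseline.

    The baseline is the mean of `history`. A reading is flagged when it lies
    more than `threshold_sigmas` standard deviations from that mean.
    """
    baseline = mean(history)
    spread = stdev(history)
    flagged = []
    for i, value in enumerate(recent):
        if abs(value - baseline) > threshold_sigmas * spread:
            flagged.append((i, value))
    return flagged


# Assumed monthly chiller energy figures (arbitrary units, illustrative only).
baseline_months = [100, 102, 98, 101, 99, 100]
new_months = [101, 99, 112]

print(find_deviations(baseline_months, new_months))
```

In practice the baseline would also need to account for seasonal variation and IT load, since cooling energy legitimately rises in summer.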
Potential Benefits: IT
- IT equipment typically refreshes every 3 years
- Planning: simulation and modeling
- Capacity management
- Better provisioning
- Moves, adds and changes
Cohesive Alignment of IT and Facilities
- IT asset management
- Rack-row provisioning: power, cooling, networking
- Better facilities resource allocation
Benefits of Predictive Modeling: Total Cost of Ownership (TCO)
- Energy saving
  - Better system management under varying conditions
  - Model changes > implement > review outcome
- Labor saving
  - Reduces or eliminates manual power usage surveys
  - Reduces or eliminates manual temperature surveys
  - Avoids human error and manual data entry (spreadsheets)
- Result: reduced TCO
Improved Availability: Predictive Analysis of Cooling Systems
- Avoids overstressing cooling system components through overloading or imbalance caused by poorly distributed heat loads
- Avoids or minimizes future hot spots as IT equipment is added, which could reduce equipment reliability
Improved Availability: Predictive Analysis of IT Equipment
- Ensures proper airflow under dynamic IT loading conditions
- Avoids or minimizes future hot spots as IT equipment is added, which could reduce IT equipment reliability
- Cooling failure scenarios
  - Impact on IT equipment
  - Airflow and temperature visualization during cooling unit failure
Predictive Modeling: CRAC Failure
[Figure: airflow model with one CRAC unit marked "Failed"; simplified example for clarity]
Stranded Capacity
Stranded capacity is the inability to fully utilize the design capacity of the:
- Entire data center
- Row
- Rack
It results from a less than optimal allocation of the primary resources, which has many possible causes but may stem from a lack of proper planning.
Avoid Stranded Capacity Caused by Mismatch
- Apparent capacity (facility level)
  - Space: total / % available
  - Power: total / % available
  - Cooling: total / % available
- Usable capacity (row/rack level)
  - Space, power and cooling, per row and per rack
IT Capacity Planning Tool
Predictive modeling can determine whether and where new IT equipment can be deployed.
- Rack level
  - Space (U): total/available
  - Power: total/available
  - Cooling: total/available
  - Network ports: total/available
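The rack-level check described above can be sketched as a filter over the four resources. The `Rack` and `Device` types, the rack names, and all figures are hypothetical; the sketch also assumes that a device's heat load equals its power draw.

```python
from dataclasses import dataclass


@dataclass
class Rack:
    name: str
    free_u: int             # free rack units
    free_power_kw: float    # unallocated power
    free_cooling_kw: float  # unallocated cooling
    free_ports: int         # unused network ports


@dataclass
class Device:
    height_u: int
    power_kw: float
    ports: int


def candidate_racks(racks, device):
    """Return names of racks with enough space, power, cooling and ports."""
    return [
        r.name for r in racks
        if r.free_u >= device.height_u
        and r.free_power_kw >= device.power_kw
        and r.free_cooling_kw >= device.power_kw  # assumes heat load == power draw
        and r.free_ports >= device.ports
    ]


# Illustrative inventory: A01 has space but no power, A03 has power but no space.
racks = [
    Rack("A01", free_u=20, free_power_kw=2.0, free_cooling_kw=6.0, free_ports=12),
    Rack("A02", free_u=12, free_power_kw=6.0, free_cooling_kw=6.0, free_ports=8),
    Rack("A03", free_u=4, free_power_kw=8.0, free_cooling_kw=8.0, free_ports=24),
]
blade_chassis = Device(height_u=10, power_kw=5.0, ports=4)

print(candidate_racks(racks, blade_chassis))
```

Racks A01 and A03 illustrate stranded capacity at the rack level: each has a resource to spare that cannot be used because another resource has run out.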
Causes of Stranded Capacity
- Hidden mismatch of cooling requirements: average density vs. rack density
- Example: one blade server = 5 kW
  - Space: 10,000 sq ft
  - Power: 1,000 kW critical load design (UPS)
  - Cooling: 1,000 kW (N+2), 12 x 30-ton CRACs
  - Density: 100 W per sq ft vs. watts per rack
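The facility-level figures in this example can be checked with a few lines of arithmetic. The only value not taken from the slide is the conversion factor of about 3.517 kW per ton of refrigeration (12,000 BTU/h).

```python
KW_PER_TON = 3.517  # 1 ton of refrigeration = 12,000 BTU/h, about 3.517 kW

floor_area_sqft = 10_000
critical_load_kw = 1_000
crac_units = 12
crac_tons_each = 30
redundant_units = 2  # the "+2" in N+2

# Average design density across the whole floor.
average_density_w_per_sqft = critical_load_kw * 1_000 / floor_area_sqft

# Cooling with all units running, and with the two redundant units held back.
installed_cooling_kw = crac_units * crac_tons_each * KW_PER_TON
usable_cooling_kw = (crac_units - redundant_units) * crac_tons_each * KW_PER_TON

print(f"Average density:   {average_density_w_per_sqft:.0f} W per sq ft")
print(f"Installed cooling: {installed_cooling_kw:.0f} kW")
print(f"Usable at N+2:     {usable_cooling_kw:.0f} kW")
```

Ten 30-ton units provide roughly 1,055 kW, which matches the 1,000 kW design load; the average looks balanced, and the mismatch only appears when density is examined rack by rack.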
CFD Modeling for Predictive Analysis
- Airflow management is one of the biggest issues when implementing, adding or changing IT equipment
  - Especially when moving toward higher densities or dynamic heat loads caused by virtualized IT
- Can help avoid stranded capacity due to heat density, sometimes caused by less than optimal initial placement of IT equipment that is difficult to move once operational
CFD Modeling for Predictive Analysis: Rack Density!
- Load: 50% of total power
- Average density: 100 W per sq ft
- Layout: 330 racks = 3 kW per rack (average)
- 3 blade servers per rack @ 5 kW each = 15 kW
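The rack-level numbers follow from the same 1,000 kW design load. The sketch below reproduces them and, as one possible reading of the "50% of total power" figure, works out how many 15 kW blade racks it takes to draw half of the site's power. That interpretation is an assumption, since the original diagram is not available.

```python
total_power_kw = 1_000
rack_count = 330
blades_per_rack = 3
blade_kw = 5

average_rack_kw = total_power_kw / rack_count  # about 3 kW per rack
blade_rack_kw = blades_per_rack * blade_kw     # 15 kW per blade rack
density_ratio = blade_rack_kw / average_rack_kw

# Racks of blades needed to draw half of the total power (assumed reading).
racks_at_half_power = int(0.5 * total_power_kw // blade_rack_kw)
share_of_racks = racks_at_half_power / rack_count

print(f"Average rack:     {average_rack_kw:.2f} kW")
print(f"Blade rack:       {blade_rack_kw} kW ({density_ratio:.1f}x average)")
print(f"{racks_at_half_power} blade racks ({share_of_racks:.0%} of racks) "
      f"draw about half of the total power")
```

A blade rack runs at about five times the average density, so a small fraction of the floor can consume a large share of power and cooling while the remaining space sits stranded.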
Before Beginning a DCIM Project
- Be clear about your problems and your expectations of the vendor deliverables
- Ensure your DCIM vendor can integrate with existing BMS or other monitoring systems
- Understand and plan for sensor installation challenges (e.g., downtime)
- Resource allocation: vendors require your cooperation and coordination!
DCIM is not just hardware and software
It is a philosophical commitment to a holistic approach in which Facilities and IT work together to improve the overall energy efficiency, operations and availability of the data center
The Bottom Line
- A successful DCIM project has multiple benefits. Some are directly cost-justifiable (e.g., improved energy and operational efficiency), while others are less tangible, such as improved equipment deployment and a potential increase in availability.
- DCIM is a long-term investment of capital, time and labor, which can run into the millions. This raises the ROI question: how long will it take to recover the cost?
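A first-pass answer to the ROI question is the simple payback period: project cost divided by annual savings. The sketch below uses invented figures and ignores discounting as well as the less tangible benefits, so it should be read as a starting point rather than a financial model.

```python
def simple_payback_years(project_cost, annual_savings):
    """Years needed for cumulative savings to equal the project cost."""
    if annual_savings <= 0:
        raise ValueError("annual savings must be positive")
    return project_cost / annual_savings


# Hypothetical figures for illustration only.
project_cost = 1_500_000   # capital, time and labor
annual_savings = 500_000   # energy and operational savings per year

payback = simple_payback_years(project_cost, annual_savings)
print(f"Simple payback: {payback:.1f} years")
```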
No Trees (virtual or real) were hurt or destroyed in the preparation of this presentation Thank you Julius Neudorfer Sponsored by
August 8, 2013 Datacenter Clarity LC Jay Hendrix Siemens AG 2013 All rights reserved. usa.siemens.com/datacenters
Datacenter Clarity LC
- High definition asset visualization and analytics
- Collaboration and process management
- Real-time monitoring, alarm and event notification
- DCIM Computational Fluid Dynamics (CFD)
- Key performance indicator dashboard
- Infrastructure life cycle management
- Open interface and protocol support
Datacenter Clarity LC
Powered by Siemens PLM software
- Smarter decisions
- Optimized efficiencies
- Proven technologies
- Future proof architecture
Contact
Siemens Infrastructure & Cities, Inc.
Infrastructure and Cities Sector
1000 Deerfield Parkway
Buffalo Grove, IL 60089
www.usa.siemens.com/datacenters

Jay Hendrix
Building Technologies
Head of Data Center Solutions, North America
Jay.hendrix@siemens.com