Energy-Efficient Workload Placement in Enterprise Datacenters

COVER FEATURE: CLOUD COMPUTING

Quan Zhang and Weisong Shi, Wayne State University

Power loss from an uninterruptible power supply can account for 15 percent of a datacenter's energy. A rack-level power model that relates IT workload and its power dissipation allows optimized workload placement that can save a datacenter roughly $1.4 million in annual energy costs.

Rising electricity costs are making energy efficiency a critical concern for datacenters, which devour massive amounts of energy annually. According to the Natural Resources Defense Council, US datacenters expended 91 billion kWh of electricity in 2013, with a projected increase to 140 billion kWh annually by 2020 (www.nrdc.org/energy/data-center-efficiency-assessment.asp). At these consumption rates, energy costs will continue to be a major contributor to a datacenter's total cost of ownership (TCO).

To improve datacenters' energy efficiency, researchers and practitioners have focused on reducing IT equipment power consumption, which is typically 30 percent of a datacenter's energy cost. The most popular approaches apply dynamic voltage/frequency scaling (DVFS) to reduce power dissipation from the CPU and memory subsystem;1,2 consolidate servers by assigning tasks to fewer servers and shutting down idle ones;3,4 or evenly allocate workloads among servers through load balancing.5,6 Other approaches are based on hardware-resource use, such as subsystem power models for specific computer components7-9 and system power models for nonvirtualized and virtualized environments.10-12

Reducing IT equipment power consumption certainly has merit, but these strategies ignore another large contributor to energy cost: power losses from an uninterruptible power supply (UPS), which account for an additional 15 percent of overall energy cost.13 To address this area, we created a rack-level power model that maps workload directly to its power dissipation and formulated a mathematical solution that chooses an optimal workload allocation to minimize IT equipment power consumption and power loss from UPSs. Using a TCO model, we then analyzed potential electricity cost savings.

Our experimental results show that the rack-level power model precisely matches measured power, with an error rate of ±2.5 percent or less. For a datacenter that hosts 50 racks (1,000 servers) with 10 applications, our simulation showed a potential power savings of up to 5.2 percent relative to a uniform workload allocation. This percentage translates to $1.4 million in annual energy cost savings for a 76-MW datacenter with a power-usage effectiveness (PUE) of 1.7.

UPS AND ENERGY USE

Figure 1 shows a simplified power flow in a typical datacenter. From the racks, power is distributed through strips to individual servers, all of which have their own power supplies.

FIGURE 1. Simplified power flow in a typical datacenter. At the highest layer, the utility power and backup power, such as a diesel generator, pass through uninterruptible power supplies (UPSs) through an automatic transfer switch (ATS) and go through power distribution units (PDUs) to different racks.

Because a UPS represents a single failure point, datacenters often use redundant UPSs in both centralized and distributed topologies to ensure that hosted services are always available. A redundant configuration can be a single N system, comprising one UPS module, or multiple N systems, comprising parallel modules whose capacities are matched to the critical-load projection. A centralized topology typically deploys UPSs at the facility level; a distributed topology deploys them at the rack or server level.14 The choice of configuration depends on a datacenter's failure frequency. Two popular redundant configurations are parallel, or N+1, and system-plus-system, or 2N. An N+1 redundant configuration consists of parallel, same-size UPS modules, and the spare power is at least equal to the critical-load capacity. A 2N redundant configuration, the most reliable and expensive design, can tolerate every conceivable single failure point.

TABLE 1. Uninterruptible power supply (UPS) power loss with two workload distributions.

                           Workload distribution 1            Workload distribution 2
UPS configuration          Loaded        Power     Total      Loaded        Power     Total
                           capacity (%)  loss (W)  loss (W)   capacity (%)  loss (W)  loss (W)
N+1    Rack 1              87.50         1,094     2,047      74.30         1,026     2,079
       Rack 2              54.50           952                67.75         1,053
2N     Rack 1              43.75         1,863     3,432      37.15         1,723     3,407
       Rack 2              27.25         1,569                33.88         1,684

Each configuration has a unique power-loss behavior. In an N+1 configuration, power loss decreases when the IT power load increases; in a 2N configuration, power loss increases as IT power load increases. Thus, for a rack-level UPS configuration, neither fewer servers running at full speed nor more servers running slower with a uniform workload distribution will always save power, because lowering the UPS output load leads to lower conversion efficiency. This observation about power-loss behavior was foundational to our work.

Enterprise datacenters generally run fewer applications, sometimes only one, across the entire datacenter. Google's datacenters, for example, run Web 2.0 and software-as-a-service (SaaS) workloads. In these cases, a single datacenter has a large workload, no virtualization, and tens of thousands of physical servers. Thus, workload placement is critical in separating the often millions of user requests across racks.

ANALYZING UPS POWER LOSS

In a double-conversion UPS, power loss occurs when power transforms from AC to DC for battery storage and again from DC to AC for delivery to racks and servers. Power loss is also tied to UPS topology. Our focus is on power loss in a rack-level distributed UPS topology, where power loss depends on UPS efficiency and the choice of redundant configuration, and loaded capacity (real-time power load) depends on IT workload. For an N+1 configuration, loaded capacity varies from 0 to 100 percent; for a 2N configuration, maximum loaded capacity is only 50 percent, as the total power load is evenly allocated to two UPSs. UPS efficiency depends on the technology used.
To gather evidence that optimal workload distributions for various UPS configurations differ, we looked at data from two racks in the Wayne State University datacenter; one rack had 20 fully loaded servers and the other had 20 idle servers. We then collected data from two workload distributions: distribution 1 had all 20 fully loaded servers on one rack and all idle servers on the other rack; distribution 2 had 12 fully loaded servers and 8 idle servers on one rack and the remaining mix of 8 loaded and 12 idle servers on the other rack. Table 1 shows the UPS power losses of the different UPS configurations and workload distributions. For the N+1 configuration, distribution 1 has lower power losses; for the 2N configuration, distribution 2 has lower power losses. Figure 2 shows a UPS efficiency curve based on data we collected from our two workload distributions. Typically, lower UPS efficiency leads to higher UPS power losses, but in datacenters, IT equipment dictates loaded capacity and thus UPS power loss.
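The loaded-capacity columns of Table 1 follow directly from the per-server power figures the authors report with Figure 2. The short Python sketch below (our own check, not the authors' code) recomputes them, agreeing with Table 1 to within a few tenths of a percent; the power-loss columns, by contrast, come from the authors' measurements of each redundant configuration and are not reproduced here.

```python
# Recompute Table 1's loaded-capacity columns from the Figure 2 figures:
# 350 W per fully loaded server, 4,366 W idle power for 20 servers,
# and an 8-kW rack-level UPS.
PEAK_W = 350.0               # measured peak power per server (W)
IDLE_W = 4366.0 / 20         # idle power per server (W), ~218.3 W
UPS_RATING_W = 8000.0        # rack UPS power rating (W)

def loaded_capacity(n_busy: int, n_idle: int) -> float:
    """Rack IT load as a percentage of the UPS power rating."""
    load_w = n_busy * PEAK_W + n_idle * IDLE_W
    return 100.0 * load_w / UPS_RATING_W

# Distribution 1: all 20 busy servers on rack 1, all 20 idle on rack 2.
print(f"dist 1: {loaded_capacity(20, 0):.1f}%, {loaded_capacity(0, 20):.1f}%")
# Distribution 2: 12 busy + 8 idle on rack 1, 8 busy + 12 idle on rack 2.
print(f"dist 2: {loaded_capacity(12, 8):.1f}%, {loaded_capacity(8, 12):.1f}%")
# Output: dist 1: 87.5%, 54.6%   dist 2: 74.3%, 67.7%
```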

FIGURE 2. UPS efficiency curve (efficiency, in percent, versus UPS load as a percentage of the full power rating), based on data collected from Wayne State University's datacenter. To find the relationship between IT equipment power and UPS power loss, we used a UPS with a power rating of 8 kW for a rack with 20 servers. All servers had the same measured peak power of 350 W. The idle power of the 20 servers was 4,366 W.

Given the UPS efficiency curve in Figure 2, we used a natural logarithmic function to fit the curve and Mathematica to calculate the UPS power loss. Figure 3 shows the power loss of UPSs with N+1 and 2N configurations.

FIGURE 3. UPS power loss of a single rack with N+1 and 2N configurations (power loss, in watts, versus UPS loaded capacity). For the N+1 configuration, power loss increases when loaded capacity is less than 50 percent and decreases when it is higher than 50 percent. For the 2N configuration, power loss continuously increases with loaded capacity.

MODELING ENERGY-EFFICIENT PLACEMENT

On the basis of the data in Table 1, we formulated an optimization problem to minimize the total power of IT equipment and UPS power loss through the use of a rack-level power model that directly maps the rack's workload to its power dissipation. We used the model along with our workload-placement calculations to solve the optimization problem.

Rack-level power modeling

Our rack-level power model uses workload information, such as throughput and instructions per cycle (IPC), as direct inputs. The model's target application is an enterprise datacenter with nonvirtualized servers, each of which hosts only one application. We assume that the CPU is running at a fixed speed without dynamic tuning. We express the rack-level power model as

$$P_i(w) = P_i^{\mathrm{IDLE}} + \sum_j \alpha_{ij} w_{ij}, \qquad (1)$$

where $P_i^{\mathrm{IDLE}}$ is the rack's idle power and the summation of $\alpha_{ij} w_{ij}$ is the total power introduced by all workloads on this rack. $\alpha_{ij}$ is the coefficient that represents watts per unit of performance of workload $j$ on the $i$th rack. $\alpha_{ij}$ has different units for different applications and hardware. For CPU-intensive applications, $\alpha_{ij}$ could be watts per instruction; for memory-intensive applications, it could be watts per byte; and for Web services, watts per request might be a solid indicator of system efficiency. Workload profiling provides historic knowledge that can be used to choose the appropriate $\alpha_{ij}$ metric.

In our experiments, we profiled an application in four steps. We first measured the idle power of the $i$th rack as $P_i^{\mathrm{IDLE}}$. We then fully loaded the rack to get a performance upper bound for this application. As a third step, we gradually increased the workload, making the rack run at different power levels, and recorded the rack power. Finally, we calculated the average value (watts per unit of performance) of all sample points, which we used as $\alpha_{ij}$. We repeated this process for different types of applications to get the corresponding $\alpha_{ij}$ value for the $i$th rack.
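To make the four profiling steps concrete, here is a minimal Python sketch (our own construction, not the authors' code) of how $\alpha_{ij}$ could be fit from the step-3 samples with ordinary least squares. The workload and power samples below are invented for illustration; the paper reports the actual coefficients in the "Evaluation results" section.

```python
# Fit the watts-per-performance coefficient alpha for one application
# on one rack, following the paper's four profiling steps.
import numpy as np

P_IDLE = 4366.0                        # step 1: measured rack idle power (W)

# Step 3: run the rack at several workload levels and record its power.
workload = np.array([100.0, 250.0, 400.0, 550.0, 700.0])       # e.g., requests/s
rack_power = np.array([4610.0, 4975.0, 5330.0, 5700.0, 6060.0])  # watts

# Step 4: fit P(w) = P_IDLE + alpha * w, i.e., regress the dynamic power
# (P - P_IDLE) on the workload with no intercept term.
alpha, *_ = np.linalg.lstsq(workload[:, None], rack_power - P_IDLE, rcond=None)
print(f"alpha ~= {alpha[0]:.2f} W per unit of workload")   # ~2.42 here

def estimate_rack_power(workloads, alphas):
    """Equation 1 for a rack hosting several profiled applications."""
    return P_IDLE + float(np.dot(alphas, workloads))
```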

Optimization problem

Our optimization problem was for a datacenter that hosts multiple applications simultaneously, with each server hosting only one application at a time, and a workload that can be dynamically assigned to a different server subset. In addition, we assumed that one UPS is connected to only one rack deployed in either an N+1 or a 2N redundant configuration. Because UPS power loss varies significantly with IT power load, clearly any workload change or revised distribution will affect IT equipment power loss. Our goal was to minimize both the total IT equipment power and the wasted rack-level UPS power. We chose the optimal workload allocation given the equality and inequality constraints of

- performance, which means the summation of all racks' workloads should be equal to the total workload from all users;
- capacity, which means the hardware resource requirement should be less than each rack's maximum hardware capacity; and
- power, which means the total rack power satisfies the specified power-capping requirement (set by operator or hardware).

Given these constraints, the mathematical formulation of the optimization problem is

$$\text{Minimize} \sum_i \frac{P_i}{\eta(P_i)} \qquad (2)$$

as long as

$$\sum_i w_{ij} = w_j, \qquad (3)$$

$$w_{ij} \le C_{ij}, \text{ and} \qquad (4)$$

$$P_i = P_i^{\mathrm{IDLE}} + \sum_j \alpha_{ij} w_{ij} \le P_i^{\mathrm{CAP}}, \qquad (5)$$

where $P_i$ is the power of the $i$th rack and $\eta(P_i)$ is the conversion efficiency when the UPS has the IT power load $P_i$. Equation 3 ensures that the performance requirement is satisfied for workload $j$. In Equation 4, $C_{ij}$ is the capacity limitation for workload $j$ on rack $i$, and $C_i$ is the hardware limitation for rack $i$. In Equation 5, $P_i^{\mathrm{CAP}}$ is the capping power for the $i$th rack. The function $\eta(P_i)$ denotes the relationship between the UPS output power load and its corresponding conversion efficiency, and can be expressed as

$$\eta(P_i) = \alpha \ln\!\left(\frac{P_i}{P^{\mathrm{UPS}}}\right) + b, \qquad (6)$$

where $P^{\mathrm{UPS}}$ is the UPS input power, and $\alpha$ and $b$ are fixed to match the conversion efficiency curve for different UPSs. In our evaluation, we chose a value of 0.1279 for $\alpha$ and 0.9343 for $b$.
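As a concrete illustration of Equations 2 through 6, the following Python sketch solves a toy instance of the placement problem with an off-the-shelf nonlinear solver. The rack counts, demands, and limits are invented, and the efficiency-fit constants are the paper's values; the authors solved their instances in Mathematica, not with this code.

```python
# Toy instance of the placement problem in Equations 2-6: choose w[i][j]
# (workload j placed on rack i) to minimize total UPS input power.
import numpy as np
from scipy.optimize import minimize

R, A = 3, 2                                  # racks, applications
P_IDLE = np.full(R, 4366.0)                  # idle power per rack (W)
alpha = np.array([[3e-5, 2.4e-3]] * R)       # alpha_ij: W per unit of workload
W_TOTAL = np.array([2e7, 8e5])               # w_j: total demand per application
C = np.array([[1e7, 1e6]] * R)               # C_ij: per-rack workload limits
P_CAP = np.full(R, 8000.0)                   # P_i^CAP: rack power caps (W)
P_UPS, a, b = 8000.0, 0.1279, 0.9343         # Equation 6 parameters

def rack_power(w):                           # Equation 1
    return P_IDLE + (alpha * w.reshape(R, A)).sum(axis=1)

def objective(w):                            # Equation 2: total UPS input power
    P = rack_power(w)
    eta = a * np.log(P / P_UPS) + b          # Equation 6
    return (P / eta).sum()

constraints = [
    # Equation 3: placements of each application sum to its total demand.
    {"type": "eq", "fun": lambda w: w.reshape(R, A).sum(axis=0) - W_TOTAL},
    # Equation 5: every rack stays under its power cap.
    {"type": "ineq", "fun": lambda w: P_CAP - rack_power(w)},
]
bounds = [(0.0, c) for c in C.ravel()]       # Equation 4: 0 <= w_ij <= C_ij

w_even = np.tile(W_TOTAL / R, (R, 1)).ravel()    # start from the even split
result = minimize(objective, w_even, method="SLSQP",
                  bounds=bounds, constraints=constraints)
print(result.x.reshape(R, A), result.fun)
```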

EVALUATION RESULTS

To verify the power model, we conducted an experiment with 10 servers and two applications. The servers were eight Intel CPU servers and two AMD CPU servers; the applications were Y-Cruncher (www.numberworld.org/y-cruncher), a CPU-intensive application, and the Yahoo Cloud Serving Benchmark (YCSB; http://labs.yahoo.com/news/yahoo-cloud-serving-benchmark), which simulates Web service requests that read and write to a database. We ran the two applications separately to get $\alpha_{ij}$ as described in Equation 1 and then estimated real-time power by running the two applications on all 10 machines simultaneously while randomly changing each application's workload during the test. The sample frequency was 1 Hz, which is sufficient for tasks that must execute over many hours or even days. We conducted the test on Intel and AMD machines separately and used linear regression to fit the datapoints. The $P^{\mathrm{IDLE}}$ of the 10 machines was 2,183 W. The workloads of Y-Cruncher and YCSB are represented in digits per second and operations per second, respectively. The coefficients $\alpha_{ij}$ of Y-Cruncher on Intel and AMD servers were $3 \times 10^{-5}$ W/digit/s and $4 \times 10^{-5}$ W/digit/s. The coefficients $\alpha_{ij}$ of YCSB on Intel and AMD servers were 0.0024 W/operation/s and 0.0039 W/operation/s.

Figure 4 shows the power measured by a power meter and the power estimated using our rack-level power model. Error rates were within ±2.5 percent, a corresponding power-estimation error of less than 83 W. Moreover, our rack-level power model overestimated power consumption 82 percent of the time (246 out of 300 sample points). For underestimated cases, the gap between measured power and estimated power was less than 47 W, with an error rate of 1.4 percent. These results are significant in light of the optimization problem's power constraint: a high underestimation probability and error rate could lead to a violation of the rack-level power-capping requirement.

FIGURE 4. Real-time power estimation and error rate (measured power, estimated power, and error rate versus time over a 300-second trace). Error rate is represented as (estimated power - measured power)/measured power. In some cases, the gap between measured power and estimated power is less than 47 W, with an error rate of 1.4 percent.

Simulation results

We compared the total IT equipment power and the wasted power from UPSs under both our optimal workload allocation and a baseline case that evenly allocates workload among racks. We assumed that each rack hosts 20 servers and that each rack's power is supplied by one or two UPSs in an N+1 or a 2N UPS configuration. Our simulation was for 50 racks running 10 applications simultaneously. We used Mathematica to perform the simulation, which ended when either the iterations exceeded a predefined threshold or the results converged to the requested precision.

Power-reduction comparison. Figure 5 shows the power reduction with our optimal workload allocation relative to the baseline allocation (evenly distributed workloads). For both the N+1 and 2N UPS configurations, the optimal workload allocation reduces power consumption by 1.23 percent to 5.20 percent. The N+1 configuration has a slightly higher power-reduction rate, but overall the rate gap between configurations is small for all loads. Optimal workload allocation achieves the highest power reduction at a datacenter utilization of 50 percent, which is also the average level for most datacenters.15

FIGURE 5. Power reduction with our optimal workload allocation (optimal power) relative to an evenly allocated workload (baseline) for an N+1 and a 2N UPS configuration, versus datacenter utilization. Power reduction is represented as (baseline power - optimal power)/baseline power. Power reduction is highest for both configurations when datacenter utilization is 50 percent.

The degree of power reduction depends on the UPS efficiency curve and the alpha coefficient value ($\alpha_{ij}$) in Equation 1. As an extreme example, if UPS efficiency is constant, power reduction will be zero for all datacenter-utilization levels. That is, regardless of workload distribution, the total UPS output power is constant for a particular workload size; because UPS efficiency is constant, UPS input power and power loss are also constant. Because the alpha coefficient decides the power-increase rate for a specific application, it affects the datacenter's power consumption (UPS output power), which is why a different workload allocation might have a different power consumption for the same workload size, as the sketch below illustrates.
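The constant-efficiency argument can be checked in a few lines. This is a sketch with invented numbers, assuming two identical racks:

```python
# Why a flat efficiency curve yields zero savings: with identical racks
# and a constant eta, total UPS input power is the same for every split
# of the same total workload.
import numpy as np

P_IDLE, alpha, eta = 4366.0, 2.4e-3, 0.90    # constant conversion efficiency
W_TOTAL = 8e5                                 # fixed total workload size
for split in ([0.5, 0.5], [0.7, 0.3], [0.9, 0.1]):
    P = P_IDLE + alpha * W_TOTAL * np.array(split)   # Equation 1 per rack
    print(f"{split}: total UPS input power = {(P / eta).sum():.1f} W")
# Every split prints the same total, so optimal placement cannot beat
# the even baseline when eta does not vary with load.
```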
Workload-type effects. To better understand how workload type affects power reduction, we changed the application mix while keeping total datacenter utilization at 50 percent. In this simulation, we divided the 10 applications into two categories, CPU-intensive and Web service, and then mixed the types in different proportions. As Figure 6 shows, a greater proportion of CPU-intensive applications translates to higher power reduction, with the maximum reduction at 85 percent CPU-intensive applications and 15 percent Web service applications. These results are predictable: the more CPU-intensive applications there are, the more power goes to the rack. The higher rack power increases the UPS's loaded capacity, which could result in lower power loss.

FIGURE 6. Power reduction with different proportions of CPU-intensive (CPU) and Web service (Web) applications when datacenter utilization is 50 percent. Power reduction is (baseline power - optimal power)/baseline power.

Translation to energy-cost reduction

Datacenter power capacity varies considerably. As of 2011, it was estimated to be anywhere from 2 to 100 MW (www.greenpeace.org/international/global/international/publications/climate/2011/Cool%20IT/dirty-data-facilities-table-greenpeace.pdf), with half the datacenters surveyed falling between 20 and 76 MW. According to a 2014 Uptime Institute report (https://journal.uptimeinstitute.com/2014-data-center-industry-survey), the average datacenter PUE is 1.7. For a datacenter with a 76-MW capacity, IT equipment power consumption would be 44.7 MW. With the 5.2 percent maximum power-consumption reduction demonstrated in our simulation, the datacenter could reduce its annual energy cost by roughly $1.4 million (44.7 MW × 365 days × 24 h × 0.07 $/kWh × 0.052).
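Spelling out the parenthetical: 44.7 MW is the 76-MW capacity divided by the 1.7 PUE, a year has 8,760 hours, and $0.07/kWh is the authors' assumed electricity price:

$$\frac{76~\mathrm{MW}}{1.7} \approx 44.7~\mathrm{MW}; \qquad 44{,}700~\mathrm{kW} \times 8{,}760~\mathrm{h} \times 0.07~\$/\mathrm{kWh} \times 0.052 \approx \$1.4~\mathrm{million~per~year}.$$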

Our experiments with the rack-level power model show that UPS configurations significantly affect a datacenter's energy efficiency and that UPS power loss differs as the IT workload changes. Along with our workload-placement calculations, the model minimizes IT equipment power consumption and UPS power loss, achieving up to a 5.2 percent power reduction relative to an even workload-allocation strategy. In future work, we will focus on how DVFS and switching servers on and off can enhance UPS efficiency.

ACKNOWLEDGMENTS

This work is supported in part by National Science Foundation (NSF) grant CNS-1205338. We thank Wayne State University's Computing and Information Technology Department for its significant assistance and NextEnergy for collecting data and running experiments. The research reported in this article is based on work done by Weisong Shi while he was at NSF.

REFERENCES

1. A. Gandhi et al., "Optimal Power Allocation in Server Farms," Proc. ACM Int'l Conf. Measurement and Modeling of Computer Systems (SIGMETRICS 09), 2009, pp. 157-168.
2. Q. Deng et al., "CoScale: Coordinating CPU and Memory System DVFS in Server Systems," Proc. 45th IEEE/ACM Int'l Symp. Microarchitecture (MICRO 12), 2012, pp. 143-154.
3. J.S. Chase et al., "Managing Energy and Server Resources in Hosting Centers," ACM SIGOPS Operating Systems Rev., vol. 35, no. 5, 2001, pp. 103-116.
4. R. Nathuji and K. Schwan, "VirtualPower: Coordinated Power Management in Virtualized Enterprise Systems," ACM SIGOPS Operating Systems Rev., vol. 41, no. 6, 2007, pp. 265-278.
5. Q. Tang et al., "Energy-Efficient Thermal-Aware Task Scheduling for Homogeneous High-Performance Computing Data Centers: A Cyber-Physical Approach," IEEE Trans. Parallel and Distributed Systems, vol. 19, no. 11, 2008, pp. 1458-1472.
6. A. Verma, P. Ahuja, and A. Neogi, "pMapper: Power and Migration Cost Aware Application Placement in Virtualized Systems," Proc. 9th ACM/IFIP/USENIX Int'l Conf. Middleware (Middleware 08), 2008, pp. 243-264.
7. R. Joseph and M. Martonosi, "Run-time Power Estimation in High-Performance Microprocessors," Proc. 6th ACM/IEEE Int'l Symp. Low Power Electronics and Design (ISLPED 01), 2001, pp. 135-140.
8. H. David et al., "RAPL: Memory Power Estimation and Capping," Proc. 15th ACM/IEEE Int'l Symp. Low-Power Electronics and Design (ISLPED 10), 2010, pp. 189-194.
9. J. Zedlewski et al., "Modeling Hard-Disk Power Consumption," Proc. 2nd USENIX Conf. File and Storage Technologies (FAST 03), 2003, pp. 217-230.
10. C. Lefurgy, X. Wang, and M. Ware, "Server-Level Power Control," Proc. 4th IEEE Int'l Conf. Autonomic Computing (ICAC 07), 2007, pp. 4-14.
11. D. Meisner, B.T. Gold, and T.F. Wenisch, "The PowerNap Server Architecture," ACM Trans. Computer Systems, vol. 29, no. 1, 2011, pp. 3.1-3.24.
12. A. Kansal et al., "Virtual Machine Power Metering and Provisioning," Proc. 1st ACM Symp. Cloud Computing (SOCC 10), 2010, pp. 39-50.
13. E. Pakbaznia and M. Pedram, "Minimizing Data Center Cooling and Server Power Costs," Proc. 14th ACM/IEEE Int'l Symp. Low-Power Electronics and Design (ISLPED 09), 2009, pp. 145-150.
14. V. Kontorinis et al., "Managing Distributed UPS Energy for Effective Power Capping in Datacenters," Proc. 39th Ann. IEEE/ACM Int'l Symp. Computer Architecture (ISCA 12), 2012, pp. 488-499.
15. L.A. Barroso and U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Morgan and Claypool, 2009.

ABOUT THE AUTHORS

QUAN ZHANG is a doctoral researcher in computer science at Wayne State University. His research interests include distributed systems, cloud computing, and energy-efficient computing. Zhang received a BS in computer science from Tongji University. He is a student member of IEEE. Contact him at quan.zhang@wayne.edu.

WEISONG SHI is a professor of computer science at Wayne State University. His research interests include energy-efficient computer systems and software, Internet computing, and mobile health. Shi received a PhD in computer engineering from the Chinese Academy of Sciences. He is an IEEE Fellow and a Senior Member of ACM. Contact him at weisong@wayne.edu.