About Circuit Cellar Staff

Circuit Cellar's editorial team comprises professional engineers, technical editors, and digital media specialists. You can reach the Editorial Department at editorial@circuitcellar.com, @circuitcellar, and facebook.com/circuitcellar.

Budgeting Power in Data Centers

In my May 2014 Circuit Cellar article, “Data Centers in the Smart Grid” (Issue 286), I discussed the growing data center energy challenge and a novel potential solution that modulates data center power consumption based on requests from the electricity provider. In the same article, I elaborated on how data centers can provide “regulation service reserves” by tracking a dynamic power regulation signal broadcast by the independent system operator (ISO).

Demand-side provision of regulation service reserves is one of several ways of providing capacity reserves that are gaining traction in US energy markets. Frequency control reserves and operating reserves are other examples. These reserves are similar in that the demand side, such as a data center, modulates its power consumption in reaction to local measurements and/or to signals broadcast by the ISO. The time scale of modulation, however, differs depending on the reserve: modulation can be done in real time, every few seconds, or every few minutes.

In addition to the emerging mechanisms of providing capacity reserves in the grid, there are several other options for a data center to manage its electricity cost. For example, data center operators can negotiate electricity pricing with the ISO such that the electricity cost is lower when the data center consumes power below a given peak value. In this scenario, the electricity cost is significantly higher if the center exceeds the given limit. “Peak shaving,” therefore, refers to actively controlling the peak power consumption using data center power-capping mechanisms. Other mechanisms of cost and capacity management include load shedding, which temporarily reduces the load in a data center; load shifting, which delays executing loads to a future time; and migration of a subset of loads to other facilities, if such an option is available.

All of these mechanisms require the data center to be able to dynamically cap its power within a tolerable error margin. Even in the absence of advanced cost management strategies, a data center generally needs to operate under a predetermined maximum power consumption level because its electricity distribution infrastructure is built for that level.

This article appears in Circuit Cellar 292.

Most data centers today run a diverse set of workloads (applications) at a given time. Therefore, an interesting sub-problem of the power capping problem is how to distribute a given total power cap efficiently among the computational, cooling, and other components in a data center. For example, if there are two types of applications running in a data center, should one give equal power caps to the servers running each of these applications, or should one favor one of the applications?

Even when the loads have the same level of urgency or priority, designating equal power to different types of loads does not always lead to efficient operation. This is because the power-performance trade-offs of applications vary significantly. One application may meet user quality-of-service (QoS) expectations or service level agreements (SLAs) while consuming less power compared to another application.

Another reason the budgeting problem is interesting is the temperature- and cooling-related heterogeneity among the servers in a data center. Even when servers in a data center are all of the same kind (which is rarely the case), their physical location in the data center, the heat recirculation effects (which refer to some of the heat output of servers being recirculated back into the center and affecting the thermal dynamics), and the heat transfer among the servers create differences in temperatures and cooling efficiencies of servers. Thus, while budgeting, one may want to dedicate larger power caps to servers that are more cooling-efficient.

As the computational units in a data center need to operate at safe temperatures below manufacturer-provided limits, the budgeting policy in the data center needs to make sure a sufficient power budget is saved for the cooling elements. On the other hand, if there is over-cooling, then the overall efficiency drops because there is a smaller power budget left for computing.

I refer to the problem of how to efficiently allocate power to each server and to the cooling units as the “power budgeting” problem. The rest of the article elaborates on how this problem can be formulated and solved in a practical scenario.

Characterizing Loads

For distributing a total computational power budget in an application-aware manner, one needs to have an estimate of the relationship between server power and application performance. In my lab at Boston University, my students and I studied the relationship between application throughput and server power on a real-life system, and constructed empirical models that mimic this relationship.

Figure 1 demonstrates how the relationship between the instruction throughput and power consumption of a specific enterprise server changes depending on the application. Another interesting observation from this figure is that the performance of some applications saturates beyond a certain power value. In other words, even when a larger power budget is given to such an application by letting it run with more threads (or in other cases, letting the processor operate at a higher speed), the application throughput does not improve further.

Figure 1: The plot demonstrates billions of instructions per second (BIPS) versus server power consumption as measured on an Oracle enterprise server with two SPARC T3 processors.


Estimating the slope of the throughput-power curve and the potential performance saturation point helps make better power budgeting decisions. In my lab, we constructed a model that estimates the throughput given server power and hardware performance counter measurements. In addition, we analyzed the potential performance bottlenecks resulting from a high number of memory accesses and/or the limited number of software threads in the application. We were able to predict the saturation point for each application via a regression-based equation constructed based on this analysis. Predicting the maximum server power using this empirical modeling approach gave a mean error of 11 W for our 400-to-700-W enterprise server.[1]
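As a rough illustration of why the slope and the saturation point matter, the sketch below uses a piecewise-linear stand-in for a throughput-power curve. The function name, its parameters, and the numbers are hypothetical placeholders; the regression-based model in [1] relies on performance counters and is not reproduced here.

```python
def estimate_bips(power_w, idle_power_w, slope_bips_per_w, saturation_power_w):
    """Piecewise-linear stand-in for a throughput-power curve.

    Throughput grows linearly with power above idle and stops improving
    once the application-specific saturation power is reached.
    All parameters are hypothetical placeholders, not measured values.
    """
    effective_power = min(power_w, saturation_power_w)
    return max(0.0, slope_bips_per_w * (effective_power - idle_power_w))


# Hypothetical application: 0.05 BIPS per watt above 400 W, saturating at 600 W.
below_knee = estimate_bips(500.0, 400.0, 0.05, 600.0)
at_knee = estimate_bips(600.0, 400.0, 0.05, 600.0)
beyond_knee = estimate_bips(700.0, 400.0, 0.05, 600.0)
```

Under these assumptions, the extra 100 W granted beyond the knee buys no additional throughput, so a budgeting policy could hand that power to another server instead.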

Such methods for power-performance estimations highlight the significance of telemetry-based empirical models for efficient characterization of future systems. The more detailed measurement capabilities newer computing systems can provide—such as the ability to measure power consumption of various sub-components of a server—the more accuracy one can achieve in constructing models to help with the data center management.

Temperature, Once Again

In several of my earlier articles this year, I emphasized the key role of temperature awareness for improving computing energy efficiency. This key role is a result of the high cost of cooling, the fact that server energy dynamics also depend substantially on temperature (e.g., consider the interactions among temperature, fan power, and leakage power), and the impact of processor thermal management policies on performance.

Solving the budgeting problem efficiently, therefore, relies on having good estimates for how a given power allocation among the servers and cooling units would affect the temperature. The first step is estimating the CPU temperature for a given server power cap. In my lab, we modeled the CPU temperature as a function of the CPU junction-to-air thermal resistance, CPU power, and the inlet temperature to the server. CPU thermal resistance is determined by the hardware and packaging choices, and can be characterized empirically. For a given total server power, CPU power can be estimated using performance counter measurements in a similar way to estimating the performance given a server cap, as described above (see Figure 1). Our simple empirical temperature model was able to estimate temperature with a mean error of 2.9°C in our experiments on an Oracle enterprise server.[1]
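A minimal sketch of such a temperature estimate follows, assuming the simplest steady-state form in which the CPU temperature equals the inlet temperature plus thermal resistance times CPU power. The linear form, the function names, and the example values are assumptions for illustration; the article states only which quantities the model depends on.

```python
def estimate_cpu_temp_c(inlet_temp_c, cpu_power_w, r_junction_to_air_c_per_w):
    """Steady-state estimate: inlet temperature plus the rise across
    the junction-to-air thermal resistance (assumed linear model)."""
    return inlet_temp_c + r_junction_to_air_c_per_w * cpu_power_w


def max_cpu_power_w(inlet_temp_c, redline_temp_c, r_junction_to_air_c_per_w):
    """Largest CPU power that keeps the estimate at or below the redline."""
    headroom_c = redline_temp_c - inlet_temp_c
    return max(0.0, headroom_c / r_junction_to_air_c_per_w)


# Hypothetical values: 25 °C inlet, 100 W CPU power, 0.4 °C/W resistance.
cpu_temp = estimate_cpu_temp_c(25.0, 100.0, 0.4)
power_limit = max_cpu_power_w(25.0, 85.0, 0.4)
```

Inverting the same relationship, as `max_cpu_power_w` does, shows how a redline temperature turns into an upper bound on the power cap of a server.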

Heat distribution characteristics of a data center depend strongly on the cooling technology used. For example, traditional data centers use a hot aisle-cold aisle configuration, where the cold air from the computer room air conditioners (CRACs) and the hot air coming out of the servers are separated by the rows of racks that contain the servers. The second step in thermal estimation, therefore, has to do with estimating the impact of servers on one another and the overall impact of the cooling system.

In a traditional hot-cold aisle setting, the inlet server temperatures can be estimated based on a heat distribution matrix, the power consumption of all the servers, and the CRAC air temperature (which is the cold air input to the data center). The heat distribution matrix can be considered a lumped model representing the impact of heat recirculation and the air flow properties together in a single N × N matrix, where N is the number of servers.[2]
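The inlet estimate described above can be sketched as the CRAC supply temperature plus a matrix-vector product. This additive form, the function name, and the tiny two-server matrix are assumptions for illustration; a real heat distribution matrix would be characterized for a specific facility.

```python
def estimate_inlet_temps_c(crac_temp_c, heat_distribution, server_powers_w):
    """Inlet temperature of each server: CRAC supply temperature plus
    the temperature rise caused by recirculated heat.

    heat_distribution[i][j] is the inlet temperature rise (°C) at server i
    per watt consumed by server j, so the matrix is N x N.
    """
    n = len(server_powers_w)
    if len(heat_distribution) != n or any(len(row) != n for row in heat_distribution):
        raise ValueError("heat distribution matrix must be N x N")
    return [
        crac_temp_c + sum(d_ij * p_j for d_ij, p_j in zip(row, server_powers_w))
        for row in heat_distribution
    ]


# Hypothetical two-server example: server 0 also receives heat from server 1.
matrix = [[0.001, 0.002],
          [0.000, 0.001]]
inlet_temps = estimate_inlet_temps_c(20.0, matrix, [500.0, 600.0])
```

In this made-up case server 0 sees a warmer inlet than server 1 even though it draws less power, which is the kind of asymmetry a cooling-aware budget can exploit.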

Recently, some newer data centers have adopted in-row coolers that leverage liquid cooling to improve cooling efficiency. In such settings, the heat recirculation effects are expected to be less significant, as most of the heat output of the servers is immediately removed from the data center.

In my lab, my students and I used low-cost data center temperature models to enable fast dynamic decisions.[1] Detailed thermal simulation of data centers is possible through computational fluid dynamics tools. Such tools, however, typically require prohibitively long simulation times.

Budgeting Optimization

What should the goal be during power budgeting? Maximizing overall throughput in the data center may seem like a reasonable goal. However, such a goal would favor allocating larger power caps to applications with higher throughput, and absolute throughput does not necessarily indicate whether the application QoS demand is met. For example, an application with a lower BIPS may have a stricter QoS target.

Consider this example of a better budgeting metric: the fair speedup metric computes the harmonic mean of per-server speedup, where per-server speedup is the ratio of measured BIPS to the maximum BIPS for an application. The purpose of this metric is to ensure that no application starves while overall throughput is maximized.
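The metric can be sketched in a few lines. The function name and the example numbers are hypothetical; only the definition (harmonic mean of measured BIPS over maximum BIPS) comes from the text.

```python
def fair_speedup(measured_bips, max_bips):
    """Harmonic mean of per-server speedups (measured BIPS / maximum BIPS)."""
    if not measured_bips or len(measured_bips) != len(max_bips):
        raise ValueError("need one maximum BIPS value per measured value")
    speedups = [m / mx for m, mx in zip(measured_bips, max_bips)]
    if any(s <= 0.0 for s in speedups):
        return 0.0  # a fully starved application drives the metric to zero
    return len(speedups) / sum(1.0 / s for s in speedups)


# Two hypothetical allocations with the same arithmetic-mean speedup (0.75).
unbalanced = fair_speedup([5.0, 10.0], [10.0, 10.0])
balanced = fair_speedup([7.5, 7.5], [10.0, 10.0])
```

Because the harmonic mean is dominated by the smallest values, the unbalanced allocation scores lower than the balanced one even though both have the same average speedup.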

It is also possible to impose constraints on the budgeting optimization such that a specific performance or throughput level is met for one or more of the applications. The ability to meet such constraints strongly relies on the ability to estimate the power-vs.-performance trends of the applications. Thus, the empirical models I mentioned above are also essential for delivering more predictable performance to users.

Figure 2 demonstrates how the hill-climbing strategy my students and I designed for optimizing fair speedup evolves. The algorithm starts by setting the CRAC temperature to its last known optimal value, which is 20.6°C in this example. The CRAC power consumption corresponding to providing air input to the data center at 20.6°C can be computed using the relationship between CRAC temperature and the ratio of computing power to cooling power.[3] This relationship can often be derived from datasheets for the CRAC units and/or for the data center cooling infrastructure.

Figure 2: The budgeting algorithm starts from the last known optimal CRAC temperature value, and then iteratively aims to improve on the objective.


Once the cooling power is subtracted from the overall cap, the algorithm allocates the remaining power among the servers with the objective of maximizing the fair speedup. Other constraints in the optimization formulation prevent any server from exceeding manufacturer-given redline temperatures and ensure that each server receives a feasible power cap that falls between the server’s minimum and maximum power consumption levels.
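The split between cooling and computing power can be sketched as follows, assuming the ratio of computing power to cooling power at a given CRAC temperature is available as a function. The names `split_power_cap` and `hypothetical_ratio`, and the constant ratio of 3, are placeholders; a real curve would come from datasheets or a facility-specific fit.

```python
def split_power_cap(total_cap_w, crac_temp_c, ratio_of_temp):
    """Split a total cap into computing and cooling power.

    ratio_of_temp(crac_temp_c) returns computing power / cooling power.
    With total = computing + cooling and cooling = computing / ratio,
    computing = total * ratio / (1 + ratio).
    """
    ratio = ratio_of_temp(crac_temp_c)
    if ratio <= 0.0:
        raise ValueError("computing-to-cooling ratio must be positive")
    computing_w = total_cap_w * ratio / (1.0 + ratio)
    cooling_w = total_cap_w - computing_w
    return computing_w, cooling_w


def hypothetical_ratio(crac_temp_c):
    """Placeholder: a constant ratio, independent of temperature."""
    return 3.0


computing_w, cooling_w = split_power_cap(1000.0, 20.6, hypothetical_ratio)
```

With these made-up numbers, a quarter of the cap goes to cooling and the rest is left for the servers. A ratio that rises with CRAC temperature would shift more of the cap toward computing as the supply air gets warmer.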

The algorithm then iteratively searches for a better solution, as demonstrated in steps 2 to 6 in Figure 2. Once the algorithm detects that the fair speedup is decreasing (e.g., the fair speedup in step 6 is less than the speedup in step 5), it converges to the solution computed in the previous step (e.g., converges to step 5 in the example). Note that setting cooler CRAC temperatures typically requires a larger amount of cooling power; thus, the fair speedup drops. However, as the CRAC temperature increases beyond a point, the performance of the hottest servers is degraded to maintain CPU temperatures below the redline; thus, a further increase in the CRAC temperature is no longer useful (as in step 6).
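A generic hill-climbing loop in this spirit might look like the sketch below. It is not the implementation from [1]: the `evaluate` callable stands in for the inner server-allocation optimization, and the step size, direction selection, and toy objective are all assumptions for illustration.

```python
def hill_climb_crac_temp(evaluate, start_temp_c, step_c=0.5, max_steps=50):
    """Step the CRAC temperature while the objective keeps improving.

    evaluate(temp_c) is a hypothetical callable returning the fair speedup
    of the best server allocation found at that CRAC temperature.
    Returns the best (temperature, score) pair seen before the first drop.
    """
    best_temp = start_temp_c
    best_score = evaluate(best_temp)

    # Probe both neighbors and climb in the more promising direction.
    up_score = evaluate(best_temp + step_c)
    down_score = evaluate(best_temp - step_c)
    if max(up_score, down_score) <= best_score:
        return best_temp, best_score  # the starting point is a local optimum
    direction = step_c if up_score >= down_score else -step_c

    for _ in range(max_steps):
        candidate = best_temp + direction
        score = evaluate(candidate)
        if score <= best_score:
            break  # objective stopped improving; keep the previous step
        best_temp, best_score = candidate, score
    return best_temp, best_score


def toy_objective(temp_c):
    """Made-up objective peaking at 22 °C, standing in for fair speedup."""
    return 1.0 - ((temp_c - 22.0) / 10.0) ** 2


best_temp, best_score = hill_climb_crac_temp(toy_objective, 20.5)
```

Starting from the last known optimum keeps the number of objective evaluations small when conditions change slowly, which matters because each evaluation hides a full allocation problem.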

This iterative algorithm took less than a second of running time using Matlab CVX[4] in our experiments for a small data center of 1,000 servers on an average desktop computer. This result indicates that the algorithm can be run in a much shorter time with an optimized implementation, allowing for frequent real-time re-budgeting of power in a modern data center with a larger number of servers. Our algorithm improved fair speedup and BIPS per watt by 10% to 20% compared to existing budgeting techniques.

Challenges

The initial methods and results I discussed above demonstrate promising energy efficiency improvements; however, there are many open problems for data center power budgeting.

First, the above discussion does not consider loads that depend on one another. For example, high-performance computing applications often have heavy communication among server nodes. This means that the budgeting method needs to account for the impact of inter-node communication in performance estimates as well as in job allocation decisions in data centers.

Second, especially for data centers with a non-negligible amount of heat recirculation, thermally-aware job allocation significantly affects CPU temperature. Thus, job allocation should be optimized together with budgeting.

Third, elements other than the servers, such as storage units, also consume significant amounts of power in data centers. In addition, the set of servers is itself heterogeneous. Thus, a challenge lies in budgeting the power among heterogeneous computing, storage, and networking elements.

Finally, the discussion above focuses on budgeting a total power cap among servers that are actively running applications. One can, however, also adjust the number of servers actively serving the incoming loads (by putting some servers into sleep mode/turning them off) and also consolidate the loads if desired. Consolidation often decreases performance predictability. The server provisioning problem needs to be solved in concert with the budgeting problem, taking the additional overheads into account. I believe all these challenges make the budgeting problem an interesting research problem for future data centers.

 

Ayse K. Coskun (acoskun@bu.edu) is an assistant professor in the Electrical and Computer Engineering Department at Boston University. She received MS and PhD degrees in Computer Science and Engineering from the University of California, San Diego. Coskun’s research interests include temperature and energy management, 3-D stack architectures, computer architecture, and embedded systems. She worked at Sun Microsystems (now Oracle) in San Diego, CA, prior to her current position at BU. Coskun serves as an associate editor of the IEEE Embedded Systems Letters.

 

 
[1] O. Tuncer, K. Vaidyanathan, K. Gross, and A. K. Coskun, “CoolBudget: Data Center Power Budgeting with Workload and Cooling Asymmetry Awareness,” in Proceedings of IEEE International Conference on Computer Design (ICCD), October 2014.
[2] Q. Tang, T. Mukherjee, S. K. S. Gupta, and P. Cayton, “Sensor-Based Fast Thermal Evaluation Model for Energy Efficient High-Performance Datacenters,” in ICISIP-06, October 2006.
[3] J. Moore, J. Chase, P. Ranganathan, and R. Sharma, “Making Scheduling ‘Cool’: Temperature-Aware Workload Placement in Data Centers,” in USENIX ATC-05, 2005.
[4] CVX Research, “CVX: Matlab Software for Disciplined Convex Programming,” Version 2.1, September 2014, http://cvxr.com/cvx/.

Polymer Capacitors for Industrial Applications

KEMET introduced new automotive-grade polymer capacitors at Electronica 2014. The T591 high-performance automotive-grade polymer tantalum series delivers stability and endurance under harsh humidity and temperature conditions. It is available in capacitances up to 220 µF and rated up to 10 V, with operating temperatures up to 125°C.

“The T591 Series was developed with enhancements in polymer materials, design and manufacturing processes to meet the increasing demands of the telecommunications, industrial, and now, automotive segments,” Dr. Philip Lessner, KEMET Senior Vice President and Chief Technology Officer, was quoted as saying in a release.

You can use the series for a variety of projects, such as decoupling and filtering of DC-to-DC converters in automotive applications or industrial applications in harsh conditions.

Source: KEMET

 

 

24-Bit Sigma Delta A/D Converter

Analog Devices recently announced a 24-bit sigma-delta A/D converter with a fast and flexible output data rate for high-precision instrumentation and process control applications.

The AD7175-2 converter delivers 24 noise-free bits at 20 SPS and 17.2 noise-free bits at 250 ksps, providing you with a wider dynamic range. With twice the throughput for the same power consumption versus competing solutions, the AD7175-2 enables faster, more responsive measurement systems, providing a 50-ksps/channel scan rate with a 20-µs settling time.

The integrated, low-noise, true rail-to-rail input buffer enables quick and easy sensor interfacing, reduces design and layout complexity, simplifies analog drive circuitry and reduces PCB area. The AD717x family, with a wide range of pin and software compatible devices, allows consolidation and standardization across system platforms.

According to Analog Devices, the converter gives “designers a wider dynamic range, which enables smaller signal deviations to be measured as required within analytical laboratory instrumentation systems.”

Specs and features:

  • 2x the throughput for the same power consumption in comparison to other devices
  • Enables faster measurement systems providing a 50-ksps/channel scan rate with 20-µs settling time
  • Integrated true rail-to-rail input buffer for easy sensor interfacing and simplified analog drive circuitry
  • User-configurable input channels
  • 2 differential or 4 single-ended channels
  • Per-channel independent programmability
  • Integrated 2.5-V buffered 2-ppm/°C reference
  • Flexible and per-channel programmable digital filters
  • Enhanced filters for simultaneous 50-Hz and 60-Hz rejection
  • −40°C to +105°C operating temperature range

Source: Analog Devices

Cypress Enters Bluetooth Low Energy Market

Cypress Semiconductor announced at Electronica 2014 two integrated, single-chip Bluetooth Low Energy (BLE) solutions for low-power, sensor-based Internet of Things (IoT) systems: the PSoC 4 and PRoC.

Cypress BLE Pioneer Kit


According to Cypress, the PSoC 4 BLE delivers “unprecedented ease-of-use and integration in a customizable solution for IoT applications, home automation, healthcare equipment, sports and fitness monitors, and other wearable smart devices.” The PRoC (programmable radio-on-chip) is “intended for wireless Human Interface Devices (HIDs), remote controls, and applications that require wireless connectivity.”

Cypress also announced BLE development kits and reference designs.

  • CY8CKIT-042-BLE Development Kit: The kit includes a USB BLE dongle that “pairs with the CySmart master emulation tool, converting a designer’s Windows PC into a Bluetooth LE debug environment.”
  • CY5672 PRoC BLE Remote Control Reference Design Kit: The remote control has a trackpad to detect one- and two-finger gestures and includes a built-in microphone.
  • CY5682 PRoC BLE Touch Mouse Reference Design Kit: The touch mouse reference design includes buttons that map to common user interface shortcuts for Windows 8.

According to Cypress, the PSoC 4 BLE and PRoC BLE solutions are currently sampling in 68-ball CSP and 56-pin QFN packages. Production is expected in December 2014.

Source: Cypress Semiconductor