Continuous energy measurements in Stellenbosch University’s main data centre (DC) have been taking place since April this year. We now have minute-by-minute data spanning the coldest season and some summer months – and the results are becoming interesting.
A set of charts for two representative days – a hot day (20 Nov) and a coldish day (15 July) – follows (click on the charts to see them full size). The outdoor temperatures (Figs 1 & 2) were logged by a weather station in the vicinity of the data centre. On 20 Nov the temperature peaked at 33.5 °C and bottomed out at below 16 °C in the early morning. On 15 July the temperature peaked at just above 13 °C and remained within a narrow range over the 24-hour period.
Figures 3 and 4 show the mains power supplied to the DC and the output power from the UPSs – which serves as an approximation of the power supplied to the computing equipment – for each of the days. Several interesting, if expected, observations can be made:
- SU’s DC is very small compared to commercial DCs – it consumes at a rate of less than 180 kW, or between 1.1 and 1.3 GWh per annum (a quick sanity check of this arithmetic follows the list). Nevertheless, Albert Meyer observes that this is triple the energy consumption in 2002, when the UPSs were replaced – a direct consequence of the e-Campus Initiative?
- There is a discernible daily cycle in the total power supplied, with consumption higher during daytime hours. As expected, more power is consumed on hot days and the cycle profile is more pronounced on such days: consumption remained below 150 kW (average 138 kW) on 15 July, but almost reached 170 kW (average 149 kW) on 20 Nov. One can assume that this is the effect of the power consumed by air-cooling and ventilation equipment.
- However, the power supplied to the computing equipment remains constant over the 24-hour cycle (and over working and recess days!) at around 64–65 kW. One can conclude that server power consumption remains constant irrespective of whether useful work is being performed.
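As a quick sanity check of the annual-energy arithmetic above, here is a minimal Python sketch: nothing more than the measured daily averages extrapolated (crudely) over the 8760 hours in a year.

```python
# Sanity check of the annual energy figures, extrapolating the measured
# daily averages (crudely) to a full year.
HOURS_PER_YEAR = 365 * 24  # 8760 h

for label, avg_kw in [("cool day (15 Jul)", 138.0),
                      ("hot day (20 Nov)", 149.0),
                      ("IT load only", 64.5)]:
    annual_gwh = avg_kw * HOURS_PER_YEAR / 1e6   # kW x h -> kWh -> GWh
    print(f"{label}: {avg_kw} kW average ≈ {annual_gwh:.2f} GWh per annum")
```

This gives roughly 1.21 GWh for a cool day and 1.31 GWh for a hot day, which is where the 1.1–1.3 GWh per annum range comes from, with the IT load alone accounting for about 0.57 GWh.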
Figure 5 shows 1-minute values for PUE – that is, (total power supplied) over (power supplied to computing equipment). While PUE hovers around 2.17 on cool days, the average rises to 2.3 on hot days and the peaks become more marked, often clearing 2.5 and averaging around 2.4 during daylight hours. (State-of-the-art DCs such as Google’s can reach PUEs of 1.1; here’s a post on Microsoft’s new free-air-cooled facility in Ireland with an expected annual average PUE of 1.25.) The increase in our DC’s PUE can almost entirely be ascribed to the increase in cooling load.
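For completeness, the PUE series in Figure 5 is simply the minute-by-minute ratio of the two meter readings. A minimal pandas sketch of the calculation, assuming a hypothetical CSV with timestamp, mains_kw and ups_out_kw columns (not our actual logger format):

```python
import pandas as pd

# Hypothetical minute-level log: one row per minute with the two meter readings.
df = pd.read_csv("dc_power_minute.csv",
                 parse_dates=["timestamp"], index_col="timestamp")

# PUE = (total power supplied) / (power supplied to computing equipment),
# using the UPS output as the proxy for the IT load.
df["pue"] = df["mains_kw"] / df["ups_out_kw"]

daily = df["pue"].resample("D").agg(["mean", "max"])              # per-day average and peak
daytime = df.between_time("08:00", "17:00")["pue"].resample("D").mean()

print(daily.head())
print(daytime.head())
```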
The PUE is in fact better than we would have guessed, possibly thanks to the more efficient air-cooled refrigeration units that were installed in 2005. It is difficult to determine what a realistic target for our DC’s PUE should be without a thorough analysis, and perhaps simulation, of the energy-saving returns on investment of the various possible interventions. But I would guess that bringing the average PUE to below 2.0, perhaps even as low as 1.8, would be neither too difficult nor too expensive to achieve.
What should the next steps be?
Any interventions would have to be implemented systematically in order to determine each intervention’s contribution to increased efficiency (a simple before-and-after comparison is sketched after the list below). Such an approach is complicated by the fact that the DC is by definition a dynamic environment, with gear being added and removed regularly. Nevertheless, there are relatively simple, low-cost HVAC interventions that may yield significant improvements:
- Albert Meyer, the IT manager responsible for the DC, has pointed out that the IT racks only occupy up to 20% of the volume being cooled. The DC could be split into isolated sections so that only the section containing the racks need be cooled optimally. The unused area could be stripped and redesigned to optimise for energy efficiency.
- Install hot/cold aisle insulated enclosures and blanking panels.
- Systematically increase HVAC set points. We are probably, out of caution, running our servers unnecessarily cool. However, fine-grained power and environmental monitoring at server and rack level is required to determine safe margins.
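One simple way to quantify an individual intervention, as promised above, is to compare the PUE and the non-IT overhead for a few weeks on either side of the change. A sketch, again assuming the hypothetical minute-level log and a placeholder intervention date:

```python
import pandas as pd

# Same hypothetical minute-level log as before: timestamp, mains_kw, ups_out_kw.
df = pd.read_csv("dc_power_minute.csv",
                 parse_dates=["timestamp"], index_col="timestamp")
df["pue"] = df["mains_kw"] / df["ups_out_kw"]
df["overhead_kw"] = df["mains_kw"] - df["ups_out_kw"]   # cooling, UPS losses, lights, ...

# Placeholder date for a single intervention (e.g. a set-point change).
change = pd.Timestamp("2010-01-15")
before = df[change - pd.Timedelta(days=21): change]
after = df[change: change + pd.Timedelta(days=21)]

# NB: outdoor temperature will differ between the two windows, so this is
# only indicative; ideally the weather data should be controlled for as well.
for label, window in [("before", before), ("after", after)]:
    print(label,
          "PUE", round(window["pue"].mean(), 2),
          "overhead", round(window["overhead_kw"].mean(), 1), "kW")
```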
Of course, the fact that the servers consume constant power irrespective of whether useful processing is being done suggests at least the following interventions (a rough savings estimate follows the list):
- Investigating and implementing intelligent server power management.
- Implementing intelligent PDUs.
- Intensifying server virtualisation.
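To get a feel for the stakes, here is a purely back-of-envelope estimate of what shedding part of that constant IT load outside office hours might be worth; the 30% reduction is an illustrative assumption, not a measurement:

```python
# Purely illustrative: if intelligent power management and consolidation could
# shed some fraction of the ~64-65 kW IT load outside office hours, how much
# energy would that save? The 30% figure is an assumption for illustration only.
it_load_kw = 64.5
shed_fraction = 0.30                 # assumed achievable reduction when idle
off_hours_per_week = 168 - 40        # nights and weekends

weekly_kwh = it_load_kw * shed_fraction * off_hours_per_week
annual_mwh = weekly_kwh * 52 / 1000
print(f"~{annual_mwh:.0f} MWh per annum of IT load alone")
# At a PUE of ~2.2, every kWh not drawn by a server avoids roughly another
# kWh of cooling and distribution overhead on top of this.
```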
Measurement issues
The act of measuring energy consumption has also revealed various anomalies – not surprising, given that this is a relatively old facility without wiring documentation. For instance, what we have assumed to be the main supply to the DC only may also be supplying various auxiliary, non-DC loads in the office block, such as the elevator and the IT Helpdesk. Similarly, the UPSs may be supplying non-computing loads such as rack-level cooling and ventilation. These anomalies must be investigated and eliminated.
Short, ten-minute periods of anomalous readings have been observed simultaneously on all our meters on numerous occasions. The cause(s) still need to be established.
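One simple way to locate such periods automatically is to flag the minutes in which every meter deviates sharply from its own recent baseline. A sketch, assuming a hypothetical table with one column per meter and an arbitrary 25% threshold:

```python
import pandas as pd

# Hypothetical wide table: one row per minute, one column per meter.
meters = pd.read_csv("all_meters_minute.csv",
                     parse_dates=["timestamp"], index_col="timestamp")

# Call a reading anomalous if it deviates more than 25% from its own
# 60-minute rolling median; the threshold is an assumption, not a calibration.
baseline = meters.rolling("60min").median()
anomalous = (meters - baseline).abs() > 0.25 * baseline

# The periods of interest are the minutes where *every* meter misbehaves at once.
simultaneous = anomalous.all(axis=1)
print(meters.index[simultaneous])
```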
Financial incentives
In common with most universities, I guess, the IT Division does not pay the electricity bill for the DC. As with all power bills, it is centrally processed and paid, which gives IT no financial incentive to invest in improving efficiency. At current energy tariffs the DC’s contribution to the bill amounts to approximately R750 000 per annum, and as the proposed tariff increases kick in it will rapidly head north of R1 million, so the ROI on energy-efficiency investments is obvious. But the returns need to accrue to IT in order to finance the interventions…
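For what it is worth, the arithmetic behind that estimate looks roughly like the following; the tariff is back-calculated from the R750 000 figure rather than taken from the actual bill, and the 25% escalation is merely a placeholder for the proposed increases:

```python
# Rough arithmetic behind the cost estimate. The tariff is back-calculated
# from the ~R750 000 figure (roughly R0.60/kWh at ~1.25 GWh pa) and is NOT
# the actual billed rate; the 25% escalation is a placeholder.
avg_power_kw = 143.0                 # between the cool-day (138) and hot-day (149) averages
annual_kwh = avg_power_kw * 8760
tariff = 0.60                        # R/kWh, implied

for increases in range(3):
    cost = annual_kwh * tariff * 1.25 ** increases
    print(f"after {increases} x 25% increase(s): ~R{cost:,.0f} per annum")
```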
We would love to know what PUE other university DCs worldwide, especially those in a similar Mediterranean climate, are achieving.
Tags: data centre, DCiE, energy meters, PUE