## Cs.zju.edu.cn

**Optimal Power Allocation in Server Farms**
Carnegie Mellon University
Carnegie Mellon University
Pittsburgh, PA, USA
Pittsburgh, PA, USA
Hawthorne, NY, USA
improve server farm performance, by a factor of typically
Server farms today consume more than 1.5% of the total
1.4 and as much as a factor of 5 in some cases.

electricity in the U.S. at a cost of nearly $4.5 billion. Giventhe rising cost of energy, many industries are now seeking
Categories and Subject Descriptors
solutions for how to best make use of their available power.

C.4 [Performance of Systems]: Modeling techniques
An important question which arises in this context is

*howto distribute available power among servers in a server farmso as to get maximum performance*.

By giving more power to a server, one can get higher server
Theory, Experimentation, Measurement, Performance
frequency (speed). Hence it is commonly believed that, fora given power budget, performance can be maximized by
operating servers at their highest power levels. However,it is also conceivable that one might prefer to run servers
Servers today consume ten times more power than they
at their lowest power levels, which allows more servers to be
did ten years ago [3, 21]. Recent articles estimate that a
turned on for a given power budget. To fully understand the
300W high performance server requires more than $330 of
effect of power allocation on performance in a server farm
energy cost per year [24]. Given the large number of servers
with a fixed power budget, we introduce a queueing theo-
in use today, the worldwide expenditure on enterprise power
retic model, which allows us to predict the optimal power
and cooling of these servers is estimated to be in excess of
allocation in a variety of scenarios. Results are verified via
$30 billion [21].

extensive experiments on an IBM BladeCenter.

Power consumption is particularly pronounced in CPU
We find that the optimal power allocation varies for differ-
intensive

*server farms *composed of tens to thousands of
ent scenarios. In particular, it is not always optimal to run
servers, all

*sharing workload and power supply*. We consider
servers at their maximum power levels. There are scenar-
server farms where each incoming job can be routed to any
ios where it might be optimal to run servers at their lowest
server, i.e., it has no affinity for a particular server.

power levels or at some intermediate power levels. Our anal-
Server farms usually have a fixed peak power budget. This
ysis shows that the optimal power allocation is non-obvious
is because large power consumers operating server farms are
and depends on many factors such as the power-to-frequency
often billed by power suppliers, in part, based on their peak
relationship in the processors, the arrival rate of jobs, the
power requirements. The peak power budget of a server
maximum server frequency, the lowest attainable server fre-
farm also determines its cooling and power delivery infras-
quency and the server farm configuration. Furthermore, our
tructure costs. Hence, companies are interested in maxi-
theoretical model allows us to explore more general settings
mizing the performance at a server farm given a fixed peak
than we can implement, including arbitrarily large server
power budget [4, 8, 9, 21].

farms and different power-to-frequency curves. Importantly,
The power allocation problem we consider is:

*how to dis-*
we show that the optimal power allocation can significantly

*tribute available power among servers in a server farm soas to minimize mean response time*. Every server running a

*∗*Research supported by NSF SMA/PDOS Grant CCR-
given workload has a minimum level of power consumption,
0615262 and a 2009 IBM Faculty Award.

*b*, needed to operate the processor at the lowest allowablefrequency and a maximum level of power consumption,

*c*,needed to operate the processor at the highest allowable fre-quency. By varying the power allocated to a server within
Permission to make digital or hard copies of all or part of this work for
the range of

*b *to

*c *Watts, one can proportionately vary the
personal or classroom use is granted without fee provided that copies are
server frequency (See Fig. 2). Hence, one might expect that
not made or distributed for profit or commercial advantage and that copies
running servers at their highest power levels of

*c *Watts,
bear this notice and the full citation on the first page. To copy otherwise, to
which we refer to as

*PowMax*, is the optimal power allocation
republish, to post on servers or to redistribute to lists, requires prior specific
scheme to minimize response time. Since we are constrained
permission and/or a fee.

*SIGMETRICS/Performance'09, *June 15–19, 2009, Seattle, WA, USA.

by a power budget, there are only a limited number of servers
Copyright 2009 ACM 978-1-60558-511-6/09/06 .$5.00.

that we can operate at the highest power level. The rest of
the servers remain turned off. Thus PowMax corresponds to
age and Frequency Scaling) and DVFS+DFS. Section 2 dis-
having

*few fast *servers. In sharp contrast is

*PowMin*, which
cusses these mechanisms in more detail. The functional
we define as operating servers at their lowest power levels of
form of the power-to-frequency relationship for a server de-

*b *Watts. Since we spend less power on each server, PowMin
pends on many factors such as the workload used, maxi-
corresponds to having

*many slow *servers. Of course there
mum server power, maximum server frequency and the volt-
might be scenarios where we neither operate our servers at
age and frequency scaling mechanism used (DFS, DVFS
the highest power levels nor at the lowest power levels, but
or DVFS+DFS). Unfortunately, the functional form of the
we operate them at some intermediate power levels. We
server level power-to-frequency relationship is only recently
refer to such power allocation schemes as

*PowMed*.

beginning to be studied (See the 2008 papers [21, 25]) and is
Understanding power allocation in a server farm is intrin-
still not well understood. Our first contribution is the inves-
sically difficult for many reasons: First, there is no single
tigation of how power allocation affects server frequency in a
allocation scheme which is optimal in all scenarios. For ex-
single server using DFS, DVFS, and DVFS+DFS for various
ample, it is commonly believed that PowMax is the optimal
workloads. In particular, in Section 3 we derive a functional
power allocation scheme [1, 7]. However, as we show later,
form for the power-to-frequency relationship based on our
PowMin and PowMed can sometimes outperform PowMax
measured values (See Figs. 2(a) and (b)).

by almost a factor of 1.5. Second, it turns out that the
Our second contribution is the development of a queue-
optimal power allocation depends on a very long list of ex-
ing theoretic model which predicts the mean response time
ternal factors, such as the outside arrival rate, whether an
for a server farm as a function of many factors including
open or closed workload configuration is used, the power-
the power-to-frequency relationship, arrival rate, peak power
to-frequency relationship (how power translates to server
budget, etc. The queueing model also allows us to determine
frequency) inherent in the technology, the minimum power
the optimal power allocation for every possible configuration
consumption of a server (

*b *Watts), the maximum power that
of the above factors (See Section 4).

a server can use (

*c *Watts), and many other factors. It is sim-
Our third contribution is the experimental implementa-
ply impossible to examine all these factors via experiments.

tion of our schemes, PowMax, PowMin, and PowMed, on
To fully understand the effect of power allocation on mean
an IBM BladeCenter, and the measurement of their response
response time in a server farm with a fixed power budget,
time for various workloads and voltage and frequency scal-
we introduce a queueing theoretic model, which allows us to
ing mechanisms (See Sections 5 and 6). Importantly, our
predict the optimal power allocation in a variety of scenarios.

experiments show that using the optimal power allocation
We then verify our results via extensive experiments on an
scheme can significantly reduce mean response time, some-
IBM BladeCenter.

times by as much as a factor of 5. To be more concrete, we
Prior work in power management has been motivated by
show a

*subset *of our results in Fig. 1, which assumes a CPU
the idea of managing power at the global data center level [8,
bound workload in an open loop setting. Fig. 1(a) depicts
20] rather than at the more localized single-server level.

one possible scenario using DFS where PowMax is optimal.

While power management in server farms often deals with
By contrast, Fig. 1(b) depicts a scenario using DVFS where
various issues such as reducing cooling costs, minimizing idle
PowMin is optimal for high arrival rates. Lastly, Fig. 1(c)
power wastage and minimizing average power consumption,
depicts a scenario using DVFS+DFS where PowMed is op-
we are more interested in the problem of allocating peak
timal for high arrival rates.

power among servers in a server farm to maximize perfor-
We experiment with different workloads. While Section 5
mance. Notable prior work dealing with peak power alloca-
presents experimental results for a CPU bound workload,
tion in a server farm includes Raghavendra et al. [21], Femal
LINPACK, Section 6 repeats all the experiments under the
et al. [10] and Chase et al. [5] among others. Raghavendra
STREAM memory bound workload, the WebBench web work-
et al. [21] present a power management solution that co-
load, and other workloads. In all cases, experimental results
ordinates different individual approaches to simultaneously
are in excellent agreement with our theoretical predictions.

minimize average power, peak power and idle power wastage.

Section 7 summarizes our work. Finally, Section 8 discusses
Femal et al. [10] allocate peak power so as to maximize
future applications of our model to more complex situations
throughput in a data center while simultaneously attempt-
such as workloads with varying arrival rates, servers with
ing to satisfy certain operating constraints such as load-
idle (low power) states and power management at the sub-
balancing the available power among the servers. Chase et
system level, such as the storage subsystem.

al. [5] present an auction-based architecture for improvingthe energy efficiency of data centers while achieving some
2. EXPERIMENTAL FRAMEWORK
quality-of-service specifications. We differ from the abovework in that we specifically deal with minimizing mean re-
2.1 Experimental setup
sponse time for a given peak power budget and understand-
Our experimental setup consists of a server farm with up
ing all the factors that affect it.

to fourteen IBM BladeCenter HS21 blade servers featuring
Our contributions
two 3.0 GHz dual-core Intel Woodcrest Xeon processors and
As we have stated, the optimal power allocation scheme
1 GB memory per blade, all residing in a single chassis. We
depends on many factors. Perhaps the most important of
installed and configured Apache as an application server on
these is the specific relationship between the power allocated
each of the blade servers to process transactional requests.

to a server and its frequency (speed), henceforth referred
To generate HTTP requests for the Apache web servers, we
to as the

*power-to-frequency relationship*. There are sev-
employ an additional blade server on the same chassis as
eral mechanisms within processors that control the power-
the workload generator to reduce the effects of network la-
to-frequency relationship. These can be categorized into
tency. The workload generator uses the web server perfor-
DFS (Dynamic Frequency Scaling), DVFS (Dynamic Volt-
mance benchmarking tool httperf [19] in the open server
PowMax beats PowMin
PowMin beats PowMax
PowMax beats PowMin
PowMax beats PowMin
Mean response time (sec)
Mean response time (sec)
High arrival rate
High arrival rate
(a) PowMax is best.

(b) PowMin is best
(c) PowMed is best
at high arrival rates.

at high arrival rates.

Figure 1:

*Subset of our results, showing that no single power allocation scheme is optimal. Fig.(a) depictsa scenario using DFS where PowMax is optimal. Fig.(b) depicts a scenario using DVFS where PowMin isoptimal at high arrival rates whereas PowMax is optimal at low arrival rates. Fig.(c) depicts a scenario usingDVFS+DFS where PowMed is optimal at high arrival rates whereas PowMax is optimal at low arrival rates.*
farm configuration and wbox [23] in the closed server farm
When allocating power to a server, there is a minimum
configuration. We modified and extended httperf and wbox
level of power consumption (

*b*) needed to operate the proces-
to allow for multiple servers and to specify the routing prob-
sor at the lowest allowable frequency and a maximum level
ability among the servers. We measure and allocate power to
of power consumption (

*c*) needed to operate the processor
the servers using IBM's Amester software. Amester, along
at the highest allowable frequency (Of course the specific
with additional scripts, collects all relevant data for our ex-
values of

*b *and

*c *depend on the application that the server
is running). Formally, we define the following notation:Baseline power:

*b *Watts The minimum power consumed
2.2 Voltage and frequency scaling mechanisms
by a fully-utilized server over the allowable range of proces-
Processors today are commonly equipped with mecha-
sor frequency.

nisms to reduce power consumption at the expense of re-
Speed at baseline:

*sb *Hertz The speed (or frequency) of
duced server frequency. Common examples of these mecha-
a fully utilized server running at

*b *Watts.

nisms are Intel's SpeedStep Technology and AMD's Cool ‘n'
Maximum power:

*c *Watts The maximum power con-
Quiet Technology. The power-to-frequency relationship in
sumed by a fully-utilized server over the allowable range of
such servers depends on the specific voltage and frequency
scaling mechanism used. Most mechanisms fall under the
Speed at maximum power:

*sc *Hertz The speed (or fre-
following three categories:
quency) of a fully utilized server running at

*c *Watts.

Dynamic Frequency Scaling (DFS) a.k.a. Clock
Throttling or T-states is a technique to manage powerby running the processor at a less-than-maximum clock fre-quency. Under DFS, the Intel's 3.0 GHz Woodcrest Xeon
3. POWER-TO-FREQUENCY
processors we use allow for 8 operating points which corre-
An integral part of determining the optimal power allo-
spond to effective frequencies of 12.5%, 25%, 37.5%, 50%,
cation is to understand the power-to-frequency relationship.

62.5%, 75%, 87.5%, and 100% of the maximum server fre-
This relationship differs for DFS, DVFS, and DVFS+DFS,
and also differs based on the workload in use. Unfortunately,
Dynamic Voltage and Frequency Scaling (DVFS)
the functional form of the power-to-frequency relationship is
a.k.a. P-states is a more efficient power-savings mech-
not well studied in the literature. The servers we use sup-
anism that reduces server frequency by reducing the pro-
port all three voltage and frequency scaling mechanisms and
cessor voltage and frequency. Under DVFS, our processors
therefore can be used to study the power-to-frequency rela-
allow for 4 operating points which correspond to effective
tionships. In this section, we present our measurements de-
frequencies of 66.6%, 77.7%, 88.9%, and 100% of the maxi-
picted in Figs. 2(a) and (b) showing the functional relation-
mum server frequency.

ship between power allocated to a server and its frequency
DVFS+DFS attempts to leverage both DVFS, and DFS
for the LINPACK [13] workload. We generalize the func-
by applying DFS on the lowest performance state available
tional form of the power-to-frequency relationship to other
in DVFS. Under DVFS+DFS, our processors allow for 11
workloads in Section 6 (See Figs. 8 and 10). Throughout we
operating points which correspond to effective frequencies
assume a homogeneous server farm.

of 8.3%, 16.5%, 25%, 33.3%, 41.6%, 50.0%, 58.3%, 66.6%,
We use the tools developed in [17] to limit the maxi-
77.7%, 88.9%, and 100% of the maximum server frequency.

mum power allocated to each server. Limiting the maximumpower allocated to a server is usually referred to as

*capping*
2.3 Power consumption within a single server
the power allocated to a server. We run LINPACK jobs
back-to-back to ensure that the server is always occupied by
power-to-frequency relationship to be cubic, especially
the workload, and that the server is running at the specified
for the processor. By using a cubic model for DVFS+DFS,
power cap value. Hence, the power values we observe will
we wish to analyze the optimal power allocation policy
be the peak power values for the specified workload. Recall
for those settings.

from Section 2 that the voltage and frequency scaling mech-anisms have certain discrete performance points (in terms of
Approximating DVFS+DFS using a cubic fit gives the fol-
frequency) at which the server can operate. At each of these
lowing relationship between the speed of a server and the
performance points, the server consumes a certain amount
power allocated to it:
of power for a given workload. By quickly dithering between

*s *=

*sb *+

*α0 *3

*P − b.*
available performance states, we can ensure that the server
never consumes more than the set power cap value. In this
Specifically, our experiments show that

*α0 *= 0

*.*39 GHz/ 3

*W *.

way, we also get the best performance from the server for
Also, from our experiments, we found that

*b *= 150W,

*c *=
the given power cap value. Note that when we say power,
250W,

*sb *= 1.2 GHz and

*sc *= 3 GHz for DVFS+DFS.

we mean the

*system-level power *which includes the powerconsumed by the processor and all other components within
4. THEORETICAL RESULTS
Fig. 2(a) illustrates the power-to-frequency curves obtained
The optimal power allocation depends on a large num-
for LINPACK using DFS and DVFS. From the figure, we see
ber of factors including the power-to-frequency relationship
that the power-to-frequency relationship for both DFS and
just discussed in Fig. 2, the arrival rate, the minimum and
DVFS is almost linear. It may seem surprising that the
maximum power consumption levels (

*b *and

*c *respectively),
power-to-frequency relationship for DVFS looks like a lin-
whether the server farm has an open loop configuration or
ear plot. This is opposite to what is widely suggested in
a closed loop configuration, etc.

the literature for the

*processor *power-to-frequency relation-
In order to investigate the effects of all these factors on
ship, which is cubic [11]. The reason why the

*server *power-
the optimal power allocation, we develop a queueing the-
to-frequency relationship is linear can be explained by two
oretic model which predicts the mean response time as a
interrelated factors. First, manufacturers usually settle on
function of all the above factors (See Section 4.1). We then
a limited number of allowed voltage levels (or performance
produce theorems that determine the optimal power allo-
states), which results in a less-than-ideal relationship be-
cation for every possible configuration of the above factors,
tween power and frequency in practice. Second, DVFS is
including open loop configuration (See Theorems 1 and 2 in
not applied on many components at the system level. For
Section 4.2) and closed loop configuration (See Theorems 3,
example, power consumption in memory remains propor-
4 and 5 in Section 4.3).

tional to the number of references to memory per unit time,
4.1 Queueing model
which is only linearly related to the frequency of the pro-cessor. Thus, the power-to-frequency curve for both DFSand DVFS can be approximated as a linear function. Usingthe terminology introduced in Section 2, we approximatethe server speed (or frequency),

*s *GHz, as a function of thepower allocated to it,

*P *Watts, as:

*s *=

*sb *+

*α*(

*P − b*)

*.*
where the coefficient

*α *(units of GHz per Watt) is the slopeof the power-to-frequency curve. Specifically, our experi-ments show that

*α *= 0

*.*008 GHz/W for DVFS and

*α *= 0

*.*03GHz/W for DFS. Also, from our experiments, we find that

*b *= 180W and

*c *= 240W for both DVFS and DFS. How-ever,

*sb *= 2.5 GHz for DVFS and

*sb *= 1.2 GHz for DFS.

The maximum speed in both cases is

*sc *= 3 GHz, which issimply the maximum speed of the processor we use. Notethat the specific values of these parameters change depend-ing on the workload in use. Section 6 discusses our measuredparameter values for different workloads.

For DVFS+DFS, we expect the power-to-frequency rela-
tionship to be piecewise linear since it is a combination ofDVFS and DFS. Experimentally, we see from Fig. 2(b) thatthe power-to-frequency relationship is in fact piecewise lin-ear (3 GHz - 2.5 GHz and then 2.5 GHz - 1.2 GHz).

Figure 3:

*Illustration of our k-server farm model*
Though we could use a piecewise linear fit for DVFS+DFS,
we choose to approximate it using a cubic curve for the fol-lowing reasons:
Fig. 3 illustrates our queueing model for a server farm
1. Using a cubic fit demonstrates how we can extend our
with

*k *servers. We assume that there is a fixed power bud-
results to non-linear power-to-frequency relationships.

get

*P*, which can be split among the

*k *servers, allocating

*i *power to server

*i *where

*i *=

*P *. The correspond-
2. As previously mentioned, several papers consider the
ing server speeds are denoted by (

*s*1

*, . . , sk*). Each server,
Figure 2:

*Power-to-frequency curves for DFS, DVFS, and DVFS+DFS for the CPU bound LINPACK workload.*

Fig.(a) illustrates our measurements for DFS and DVFS. In both these mechanisms, we see that the serverfrequency is linearly related to the power allocated to the server. Fig.(b) illustrates our measurements forDVFS+DFS, where the power-to-frequency curve is better approximated by a cubic relationship.
*i*, receives a fraction

*qi *of the total workload coming in to
timal power allocation is non-trivial, computing

*E*[

*T *] for a
the server farm. Corresponding to any vector of power allo-
given allocation is easy. Hence we omit showing the mean
cation (

*P*1

*, . . , Pk*), there exists an optimal workload allo-
response time in each case and refer the reader to the ap-
cation vector (

*q∗*1

*, . . , q∗k*). We derive the optimal workload
pendix. Due to lack of space, we defer all proofs to the
allocation for each power allocation and use that vector,
appendix. However, we present the intuition behind the
(

*q∗*1

*, . . , q∗k*), both in theory and in the actual experiments.

theorems in each case. Recall from Section 2 that each fully
The details of how we obtain the optimal (

*q∗*1

*, . . , q∗k*) are
utilized server has a minimum power consumption of

*b *Watts
deferred to the appendix.

and maximum power consumption of

*c *Watts.

Our model assumes that the jobs at a server are scheduled
To illustrate our results clearly, we shall assume through-
using the Processor-Sharing (PS) scheduling discipline. Un-
out this section (and the Appendix) that the power budget
der PS, when there are

*n *jobs at a server, they each receive

*P *is such that PowMax allows us to run

*n *servers (each at
1/n

*th *of the server capacity. PS is identical to Round-Robin
power

*c*) and PowMin allows us to run

*m *servers (each at
with quantums (as in Linux), when the quantum size ap-
power

*b*). This is equivalent to saying:
proaches zero. A job's response time,

*T *, is the time from

*P *=

*m · b *=

*n · c*
when the job arrives until it has completed service, includ-ing waiting time. We aim to minimize

*mean response time*,
where

*m *and

*n *are less than or equal to

*k*. Obviously,

*m ≥ n*.

We will analyze our server farm model under both an open
4.2 Theorems for open loop configurations
loop configuration (See Section 4.2) and a closed loop con-
Theorem 1 derives the optimal power allocation in an open
figuration (See Section 4.3). An

*open loop configuration *is
loop configuration for a linear power-to-frequency relation-
one in which jobs arrive from outside the system and leave
ship, as is the case for DFS and DVFS. In such cases, the
the system after they complete service. We assume that
server frequency varies with the power allocated to it as
the arrival process is Poisson with average rate

*λ *jobs/sec.

*si *=

*sb *+

*α*(

*Pi − b*). The theorem says that if the speed
Sometimes it will be convenient to, instead, express

*λ *in
at baseline,

*sb*, is sufficiently low, then PowMax is optimal.

units of GHz. This conversion is easily achievable since an
By contrast, if

*sb *is high, then PowMin is optimal for high
average job has size

*E*[

*S*] gigacycles. In the theorems pre-
arrival rates and PowMax is optimal for low arrival rates. If
sented in the paper,

*λ *is in the units of GHz. However, in the

*s∗i *is the speed of server

*i *when run at power

*P∗i*, then the
appendix, when convenient for queueing analysis, we switch
stability condition requires that

*λ <*
to jobs/sec. Likewise, while it is common for us to express
Theorem 1.

*Given an open k-server farm configuration*
the speed of the server, s, in GHz, we sometimes switch

*with a linear power-to-frequency relationship (given by Eq. (1))*
to jobs/sec in the appendix when convenient. A

*closed loop*
*and power budget P, the following power allocation mini-*
*configuration *is one in which there are always a fixed number

*mizes E*[

*T *]

*:*
of users

*N *(also referred to as the multi-programming level)

*If sb ≤ α:*
who each submit one job to the server. Once a user's job is
1

*,*2

*,.,n *=

*c, P ∗*
*n*+1

*,n*+2

*,.,k *= 0
completed, he immediately creates another job, keeping the

*If sb > α:*
1

*,*2

*,.,n *=

*c, P ∗*
*n*+1

*,n*+2

*,.,k *= 0

*if λ ≤ λlow*
number of jobs constant at

*N *.

*P∗*1

*,*2

*,.,m *=

*b, P∗m*+1

*,m*+2

*,.,k *= 0

*if λ > λlow*
In all of the theorems that follow, we find the optimal

*where λlow *=

*α · P.*
power allocation, (

*P∗*1

*, P∗*2

*, . . , P∗k*), for a k-server farm, which
Corollary 1.

*For DFS, PowMax is optimal. For DVFS,*
minimizes the mean response time,

*E*[

*T *], given the fixed

*PowMax is optimal at low arrival rates and PowMin is op-*
peak power budget

*P *=

*i *. While deriving the op-

*timal at high arrival rates.*
Intuition For a linear power-to-frequency relationship,

*by Eqs. (1) and (2)), the following power allocation mini-*
we have from Eq. (1) that the speed of a server,

*si*, varies

*mizes E*[

*T *]

*for low N , based on the asymptotic approxima-*
with the power allocated to it,

*Pi*, as

*si *=

*sb *+

*α*(

*Pi − b*).

*tions in [14]:*
From this equation, it follows that the frequency per Wattfor a single server,

*si *, can be written as:

*P∗*1

*,*2

*,.,n *=

*c,*
*P∗n*+1

*,n*+2

*,.,k *= 0
=

*b − αb *+

*α.*
Corollary 3.

*For a closed-loop server farm configura-*
*tion with low N , PowMax is optimal for DFS, DVFS, and*
Hence, maximizing the frequency per Watt depends on whether

*sb ≤ αb *or

*sb > αb*. If

*sb ≤ αb*, maximizing

*si *is equivalent
to maximizing

*P*
Intuition When

*N *is sufficiently low, there are very few

*i*, which is achieved by PowMax. Alterna-
tively, if

*s*
jobs in the system. Hence few fast servers are optimal since

*b > αb*, we want to minimize

*Pi*, which is achieved
by PowMin. However, the above argument still does not take
there aren't enough jobs to utilize the servers, leaving servers
into account the mean arrival rate,

*λ*. If

*λ *is sufficiently low,
idle. Thus PowMax is optimal. This is similar to the case of
there are very few jobs in the server farm, hence, few fast
low arrival rate that we considered for an open loop server
servers, or PowMax, is optimal. The corollary follows by
farm configuration in Theorems 1 and 2.

simply plugging in the values of

*s*
When

*N *is high, the optimal power allocation is non-

*b*,

*α *and

*b *for DFS and
DVFS from Section 3.

trivial. From here on, we assume

*N *is large enough to keep
Theorem 2 derives the optimal power allocation for non-
all servers busy, so that asymptotic bounds apply (See [14]).

linear power-to-frequency relationships, such as the cubic
In our experiments, we find that

*N > *10

*k *suffices.

relationship in the case of DVFS+DFS. In such cases, the
Theorem 4 says that for high

*N *, if the speed at baseline,
server frequency varies with the power allocated to it as

*b*, is sufficiently low, then PowMax is optimal. By contrast,
if

*sb *is high, then PowMin is optimal.

*i *=

*sb *+

*α0 *3

*Pi − b*. The theorem says that if the arrival
rate is sufficiently low, then PowMax is optimal. However, if
Theorem 4.

*Given a closed k-server farm configuration*
the arrival rate is high, PowMed is optimal. Although The-

*with a linear power-to-frequency relationship (given by Eq. (1)),*
orem 2 specifies a cubic power-to-frequency relationship, we

*the following power allocation minimizes E*[

*T *]

*for high N ,*
conjecture that similar results hold for more general power-

*based on the asymptotic approximations in [14]:*
to-frequency curves where server frequency varies as the n-th

*If sb < α:*
root of the power allocated to the server.

1

*,*2

*,.,n *=

*c,*
*P∗n*+1

*,n*+2

*,.,k *= 0

*If sb ≥ α:*
1

*,*2

*,.,m *=

*b,*
*P∗m*+1

*,m*+2

*,.,k *= 0
Theorem 2.

*Given an open k-server farm configuration*
*with a cubic power-to-frequency relationship (given by Eq. (2))*
Corollary 4.

*For DFS, PowMax is optimal for high N .*
*and power budget P, the following power allocation mini-*
*For DVFS, PowMin is optimal for high N .*
*mizes E*[

*T *]

*:*
Intuition For a closed queueing system with zero think

*P∗*1

*,*2

*,.,n *=

*c,*
*P∗n*+1

*,n*+2

*,.,k *= 0

*if λ ≤ λ0low .*
time, the mean response time is inversely proportional to

*P∗*1

*,*2

*,.,l *=

*P ,*
*l*+1

*,l*+2

*,.,k *= 0

*if λ > λ0low*
% the throughput of the system. Hence, to minimize the mean
response time, we must maximize the throughput of the k-

*low *=

*nlα0*
*c − b − *3

*− b and l *=
server farm with power budget

*P*. When

*N *is high, under

*b*+

*α0P*
a server farm configuration, all servers are busy. Hence the
Corollary 2.

*For DVFS+DFS, PowMax is optimal at*
throughput is the sum of the server speeds. It can be easily

*low arrival rates and PowMed is optimal at high arrival*
shown that the throughput of the system under PowMin,

*sb ·m*, exceeds the throughput of the system under PowMax,

*sc · n*, when

*sb ≥ αb*. Hence the result.

Intuition When the arrival rate is sufficiently low, there
Theorem 5 deals with the case of high

*N *for a non-linear
are very few jobs in the system, hence, PowMax is optimal.

power-to-frequency relationship. The theorem says that if
However, for higher arrival rates, we allocate to each server
the speed at baseline,

*sb*, is sufficiently low, then PowMax is
the amount of power that maximizes its frequency per Watt
optimal. By contrast, if

*sb *is high, then PowMed is optimal.

ratio. For the cubic power-to-frequency relationship, whichhas a

*downwards concave *curve (See Fig. 2(b)), we find that
Theorem 5.

*Given a closed k-server farm configuration*
the optimal power allocation value for each server (derived

*with a cubic power-to-frequency relationship (given by Eq. (2)),*
via calculus) lies between the maximum

*c *and the minimum

*the following power allocation minimizes E*[

*T *]

*for high N ,*
*b*. Hence, PowMed is optimal.

*based on the asymptotic approximations in [14]:*
4.3 Theorems for closed loop configurations

*b < s0:*
*P∗*1

*,*2

*,.,n *=

*c,*
*P∗n*+1

*,n*+2

*,.,k *= 0

*If sb ≥ s0:*
*P∗*1

*,*2

*,.,l *=

*b *+

*x,*
*P∗l*+1

*,l*+2

*,.,k *= 0
We now move on to closed loop configurations, where the

*, s0 *=

*msc − α0 *3

*x and x is the non-*
number of jobs in the system,

*N *, is always constant. We
will rely on asymptotic operational laws (See [14]) which

*negative, real solution of the equation b *= 2

*x *+ 1

*α0 *(3

*x*
approximate the performance of the system for very high

*N*and very low

*N *(see Appendix).

Theorem 3 says that for a closed server farm configuration
Corollary 5.

*For DVFS+DFS, for high N , PowMed is*
with sufficiently low value of

*N *, PowMax is optimal.

*optimal if sb is high, else PowMax is optimal.*
Theorem 3.

*Given a closed k-server farm configuration*
Intuition As in the case of Theorem 4, we wish to maximize

*with a linear or cubic power-to-frequency relationship (given*
the throughput of the system. When we turn on a new
Mean response time (sec)
Mean response time (sec)
Mean arrival rate (jobs/sec) ®
Mean arrival rate (jobs/sec) ®
Figure 4:

*Open loop experimental results for mean response time as a function of the arrival rate using DFSand DVFS for LINPACK. In Fig.(a) PowMax outperforms PowMin for all arrival rates under DFS, by asmuch as a factor of 5. By contrast in Fig.(b), for DVFS, at lower arrival rates, PowMax outperforms PowMinby up to 22%, while at higher arrival rates, PowMin outperforms PowMax by up to 14%.*
server at

*b *units of power, the increase in throughput of the
system is

*s*
*b *GHz. For low values of

*sb*, this increase is small.

Hence, for low values of

*sb*, we wish to turn on as few servers
as possible. Thus, PowMax is optimal. However, when

*s*
is high, it pays to turn servers on. Once a server is on, theinitial steep increase in frequency per Watt afforded by a
cubic power-to-frequency relationship advocates running theserver at more than the minimal power

*b*. The exact optimal
PowMed power value (

*b *+

*x *in the Theorem) is close to theknee of the cubic power-to-frequency curve.

5. EXPERIMENTAL RESULTS
Mean response time (sec)
In this section we test our theoretical results from Sec-
tion 4 on an IBM BladeCenter using the experimental setup
Mean arrival rate (jobs/sec) ®
discussed in Section 2. We shall first present our experimen-tal results for the open server farm configuration and thenmove on to the closed server farm configuration. For the
Figure 5:

*Open loop experimental results for mean*
experiments in this section, we use the Intel LINPACK [13]

*response time as a function of the arrival rate using*
workload, which is CPU bound. We defer experimental re-

*DVFS+DFS for LINPACK. At lower arrival rates,*
sults for other workloads to Section 6.

*PowMax outperforms PowMed by up to 12%, while at*
As noted in Section 3, the baseline power level and the

*higher arrival rates, PowMed outperforms PowMax*
maximum power level for both DFS and DVFS are

*b *= 180W

*by up to 20%. Note that PowMin is worse than both*
and

*c *= 240W respectively. For DVFS+DFS,

*b *= 150W

*PowMed and PowMax throughout the range of arrival*
and

*c *= 250W. In each of our experiments, we try to fix the
power budget,

*P*, to be an integer multiple of

*b *and

*c*, as inEq. (3).

PowMin is huge; ranging from a factor of 3 at low arrival
5.1 Open server farm configuration
rates (load,

*ρ ≈ *0

*.*2) to as much as a factor of 5 at high
Fig. 4(a) plots the mean response time as a function of
arrival rates (load,

*ρ ≈ *0

*.*7). This is because the power-to-
the arrival rate for DFS with a power budget of

*P *= 720W.

frequency relationship for DFS is steep (See Fig. 2(a)), hence
In this case, PowMax (represented by the dashed line) de-
running servers at maximum power levels affords a huge gain
notes running 3 servers at

*c *= 240W and turning off all
in server frequency. Arrival rates higher than 0.22 jobs/sec
other servers. PowMin (represented by the solid line) de-
cause our systems to overload under PowMin because

*sb *is
notes running 4 servers at

*b *= 180W and turning off all other
very low for DFS. Hence, we only go as high as 0.22 jobs/sec.

servers. Clearly, PowMax outperforms PowMin throughout
Fig. 4(b) plots the mean response time as a function of the
the range of arrival rates. This is in agreement with the
arrival rate for DVFS with a power budget of

*P *= 720W.

predictions of Theorem 1. Note from Fig. 4(a) that the im-
PowMax (represented by the dashed line) again denotes run-
provement in mean response time afforded by PowMax over
ning 3 servers at

*c *= 240W and turning off all other servers.

Mean response time (sec)
Mean response time (sec)
Multi−programming level (N) ®
Multi−programming level (N) ®
Figure 6:

*Closed loop experimental results for mean response time as a function of number of jobs *(

*N *)

*in thesystem using DFS and DVFS for LINPACK. In Fig.(a), for DFS, PowMax outperforms PowMin for all valuesof N , by almost a factor of 2 throughout. By contrast in Fig.(b), for DVFS, at lower values of N , PowMax isslightly better than PowMin, while at higher values of N , PowMin outperforms PowMax by almost 30%.*
PowMin (represented by the solid line) denotes running 4
servers at

*b *= 180W and turning off all other servers. We see
that when the arrival rate is low, PowMax produces lower
mean response times than PowMin. In particular, when
the arrival rate is 0.5 jobs/sec, PowMax affords a 22% im-provement in mean response time over PowMin. However,
at higher arrival rates, PowMin outperforms PowMax, aspredicted by Theorem 1. In particular, when the arrivalrate is 1 job/sec, PowMin affords a 14% improvement inmean response time over PowMax. Under DVFS, we can
afford arrival rates up to 1 job/sec before overloading thesystem. To summarize, under DVFS, we see that PowMin
Mean response time (sec)
can be preferable to PowMax. This is due to the flatness ofthe power-to-frequency curve for DVFS (See Fig. 2(a)), and
agrees perfectly with Theorem 1.

Multi−programming level (N) ®
Fig. 5 plots the mean response time as a function of the
arrival rate for DVFS+DFS with a power budget of

*P *=1000W. In this case, PowMax (represented by the dashed
Figure 7:

*Closed loop experimental results for mean*
line) denotes running 4 servers at

*c *= 250W and turning

*response time as a function of number of jobs *(

*N *)
off all other servers. PowMed (represented by the solid line)

*in the system using DVFS+DFS for LINPACK. At*
denotes running 5 servers at

*b*+

*c *= 200W and turning off

*lower values of N , PowMed is slightly better than*
all other servers. We see that when the arrival rate is low,

*PowMax, while at higher values of N , PowMed out-*
PowMax produces lower mean response times than PowMed.

*performs PowMax, by as much as 40%. Note that*
However, at higher arrival rates, PowMed outperforms Pow-

*PowMin is worse than both PowMed and PowMax*
Max, exactly as predicted by Theorem 2. For the sake of

*for all values of N .*
completion, we also plot PowMin (dotted line in Fig. 5).

Note that PowMin is worse than both PowMed and Pow-Max throughout the range of arrival rates. Note that we
PowMax (represented by the dashed line) denotes running
use the value of

*b*+

*c *= 200

*W *as the optimal power allocated
3 servers at

*c *= 240W and turning off all other servers.

to each server in PowMed for our experiments as this value
PowMin (represented by the solid line) denotes running 4
is close to the theoretical optimum predicted by Theorem 2
servers at

*b *= 180W and turning off all other servers. Clearly,
(which is around 192W for the range of arrival rates we use)
PowMax outperforms PowMin throughout the range of

*N *,
and also helps to keep the power budget at 1000W .

by almost a factor of 2 throughout the range. This is inagreement with the predictions of Theorem 3.

5.2 Closed server farm configuration
Fig. 6(b) plots the mean response time as a function of
We now turn to our experimental results for closed server
the multi-programming level for DVFS with a power bud-
farm configurations. Fig. 6(a) plots the mean response time
get of

*P *= 720W. PowMax (represented by the dashed line)
as a function of the multi-programming level (MPL =

*N *)
again denotes running 3 servers at

*c *= 240W and turning
for DFS with a power budget of

*P *= 720W. In this case,
off all other servers. PowMin (represented by the solid line)
Figure 8:

*Power-to-frequency curves for DFS, DVFS, and DVFS+DFS for the CPU bound DAXPY workload.*

Fig.(a) illustrates our measurements for DFS and DVFS. In both these mechanisms, we see that the serverfrequency is linearly related to the power allocated to the server. Fig.(b) illustrates our measurements forDVFS+DFS, where the power-to-frequency curve is better approximated by a cubic relationship.
Mean response time (sec)
Mean response time (sec)
Mean response time (sec)
Mean arrival rate (jobs/sec) ®
Mean arrival rate (jobs/sec) ®
Mean arrival rate (jobs/sec) ®
Figure 9:

*Open loop experimental results for mean response time as a function of the arrival rate using DFS,DVFS, and DVFS+DFS for the CPU bound DAXPY workload. In Fig.(a), for DFS, PowMax outperformsPowMin throughout the range of arrival rates by as much as a factor of 5. In Fig.(b), for DVFS, PowMaxoutperforms PowMin throughout the range of arrival rates by around 30%. In Fig.(c), for DVFS+DFS, PowMaxoutperforms both PowMed and PowMin throughout the range of arrival rates. While at lower arrival ratesPowMax only slightly outperforms PowMed, at higher arrival rates the improvement is around 60%.*
denotes running 4 servers at

*b *= 180W and turning off all
predictions of Theorem 5. In particular, when

*N *= 100,
other servers. We see that when

*N *is high, PowMin pro-
PowMed affords a 40% improvement in mean response time
duces lower mean response times than PowMax. This is in
over PowMax. However, when

*N *is low, PowMed produces
agreement with the predictions of Theorem 4. In particular,
only slightly lower response times than PowMax. Note that
when

*N *= 100, PowMin affords a 30% improvement in mean
throughout the range of

*N *, PowMin is outperformed by both
response time over PowMax. However, when

*N *is low, Pow-
PowMax and PowMed.

Max produces slightly lower response times than PowMin.

This is in agreement with Theorem 3.

Fig. 7 plots the mean response time as a function of the
6. OTHER WORKLOADS
multi-programming level for DVFS+DFS with a power bud-
Thus far we have presented experimental results for a CPU
get of

*P *= 1000W. In this case, PowMax (represented by
bound workload LINPACK. In this section we present ex-
the dashed line) denotes running 4 servers at

*c *= 250W
perimental results for other workloads. Our experimental
and turning off all other servers. PowMed (represented by
results agree with our theoretical predictions

*even in the*
the solid line) denotes running 5 servers at

*b*+

*c *= 200W

*case of non-CPU bound workloads*.

and turning off all other servers. PowMin (represented by
We fully discuss experimental results for two workloads,
the dotted line) denotes running 6 servers at 170W. We
DAXPY and STREAM, in this section and summarize our
see that when

*N *is high, PowMed produces lower mean re-
results for other workloads at the end of the section. Due to
sponse times than PowMax. This is in agreement with the
lack of space, we only show results for open loop configura-
neck for STREAM's performance is the memory subsystem
since STREAM is memory bound. Thus, every extra Watt
DAXPY [22] is a CPU bound workload which we have sized
of power added to the system would mainly be used up by
to be L1 cache resident. This means DAXPY uses a lot
the memory subsystem and the improvement in processor
of processor and L1 cache but rarely uses the server mem-
frequency would be minimal.

ory and disk subsystems. Hence, the power-to-frequency
Figs. 11(a), (b) and (c) present our power allocation re-
relationship for DAXPY is similar to that of CPU bound
sults for STREAM under DFS, DVFS, and DVFS+DFS re-
LINPACK except that DAXPY's peak power consumption
spectively. Due to the downwards concave nature of the
tends to be lower than that of LINPACK, since DAXPY
power-to-frequency curves for STREAM studied in Fig. 10,
does not use a lot of memory or disk.

Theorem 2 says that PowMax should be optimal at low ar-
Figs. 8(a) and (b) present our results for the power-to-
rival rates and PowMed should be optimal at high arrival
frequency relationship for DAXPY. The functional form of
rates. However, for the values of

*α0 *in Fig. 10, we find that
the power-to-frequency relationship under DFS and DVFS
the threshold point,

*λlow*, below which PowMax is optimal,
in Fig. 8(a) is clearly linear. However, the power-to-frequency
is quite high. Hence, PowMax is optimal in Fig. 11(c). In
relationship under DVFS+DFS in Fig. 8(b) is better approx-
Figs. 11(a) and (b), PowMax and PowMed produce similar
imated by a cubic relationship. These trends are similar to
response times.

the power-to-frequency relationship for LINPACK seen in
GZIP and BZIP2 are common software applications used for
Figs. 9(a), (b) and (c) present our power allocation results
data compression in Unix systems. These CPU bound com-
for DAXPY under DFS, DVFS, and DVFS+DFS respec-
pression applications use sophisticated algorithms to reduce
tively. For DFS, in Fig. 9(a), PowMax outperforms PowMin
the size of a given file. We use GZIP and BZIP2 to compress
throughout the range of arrival rates, by as much as a fac-
a file of uncompressed size 150 MB. For GZIP, we find that
tor of 5. This is in agreement with Theorem 1. Note that
PowMax is optimal for all of DFS, DVFS, and DVFS+DFS.

we use 165W as the power allocated to each server under
These results are similar to the results for DAXPY. For
PowMin to keep the power budget same for PowMin and
BZIP2, the results are similar to those of LINPACK. In
PowMax. For DVFS, in Fig. 9(b), PowMax outperforms
particular, at low arrival rates, PowMax is optimal. For
PowMin throughout the range of arrival rates, by around
high arrival rates, PowMax is optimal for DFS, PowMin is
30%. This is in contrast to LINPACK, where PowMin out-
optimal for DVFS and PowMed is optimal for DVFS+DFS.

performs PowMax at high arrival rates. The reason why
PowMax outperforms PowMin for DAXPY is the lower value
WebBench [15] is a benchmark program used to measure
of

*sb *= 2.2 GHz for DAXPY as compared to

*sb *= 2.5 GHz
web server performance by sending multiple file requests to
for LINPACK. Since

*sb *= 0.0137

*< α *= 0

*.*014 for DAXPY
a server. For WebBench, we find the power-to-frequency
under DVFS, Theorem 1 rightly predicts PowMax to be op-
relationship for DFS, DVFS, and DVFS+DFS to be cubic.

timal. Finally, in Fig. 9(c) for DVFS+DFS, PowMax out-
This is similar to the power-to-frequency relationships ob-
performs both PowMed and PowMin throughout the range
served for STREAM since WebBench is more memory and
of arrival rates. Again, this is in contrast to LINPACK,
disk intensive. As theory predicts (See Theorem 2), we find
where PowMed outperforms PowMax at high arrival rates.

PowMax to be optimal at low arrival rates and PowMed
The reason why PowMax outperforms PowMed for DAXPY
to be optimal at high arrival rates for DFS, DVFS, and
is the higher value of

*α0 *= 0.46 GHz/ 3

*W *for DAXPY as
compared to

*α0 *= 0.39 GHz/ 3

*W *for LINPACK. This isin agreement with the predictions of Theorem 2 for high
values of

*α0*. Intuitively, for a cubic power-to-frequency re-
In this paper, we consider the problem of allocating an
lationship, we have from Eq. (2):

*s *=

*sb *+

*α0 *3

*P − b*. As
available power budget among servers in a server farm to

*α0 *increases, we get more server frequency for every Watt of
minimize mean response time. The amount of power allo-
power added to the server. Thus, at high

*α0*, we allocate as
cated to a server determines its speed in accordance to some
much power as possible to every server, implying PowMax.

power-to-frequency relationship. Hence, we begin by mea-
suring the power-to-frequency relationship within a single
STREAM [18] is a memory bound workload which does not
server. We experimentally find that the power-to-frequency
use a lot of processor cycles. Hence, the power consumption
relationship within a server for a given workload can be ei-
at a given server frequency for STREAM is usually lower
ther linear or cubic. Interestingly, we see that the relation-
than CPU bound LINPACK and DAXPY.

ship is linear for DFS and DVFS when the workload is CPU
Figs. 10(a) and (b) present our results for the power-
bound, but cubic when it is more memory bound. By con-
to-frequency relationship for STREAM. Surprisingly, the
trast, the relationship for DVFS+DFS is always cubic in our
functional form of the power-to-frequency relationship under
DFS, DVFS, and DVFS+DFS is closer to a

*cubic relation-*
Given the power-to-frequency relationship, we can view

*ship *than to a linear one. In particular, the gain in server
the problem of finding the optimal power allocation in terms
frequency per Watt at higher power allocations is much lower
of determining the optimal frequencies of servers in the server
than the gain in frequency per Watt at lower power alloca-
farm to minimize mean response time. However, there are
tions. We argue this observation as follows: At extremely
several factors apart from the server frequencies that affect
low server frequencies, the bottleneck for STREAM's perfor-
the mean response time for a server farm. These include
mance is the CPU. Thus, every extra Watt of power added
the arrival rate, the maximum speed of a server, the total
to the system would be used up by the CPU to improve its
power budget, whether the server farm has an open or closed
frequency. However, at higher server frequencies, the bottle-
configuration, etc. To fully understand the effects of these
Figure 10:

*Power-to-frequency curves for DFS, DVFS, and DVFS+DFS for the memory bound STREAMworkload. Fig.(a) illustrates our measurements for DFS and DVFS, while Fig.(b) illustrates our measurementsfor DVFS+DFS. In all the three mechanisms, the power-to-frequency curves are downwards concave, depictinga cubic relationship between power allocated to a server and its frequency.*
Mean response time (sec)
Mean response time (sec)
Mean response time (sec)
Mean arrival rate (jobs/sec) ®
Mean arrival rate (jobs/sec) ®
Mean arrival rate (jobs/sec) ®
Figure 11:

*Open loop experimental results for mean response time as a function of the arrival rate usingDFS, DVFS, and DVFS+DFS for the memory bound STREAM workload. In Figs.(a) and (b), for DFS andDVFS respectively, PowMed and PowMax produce similar response times. In Fig.(c) however, for DVFS+DFS,PowMax outperforms PowMed by as much as 30% at high arrival rates. In all three cases, PowMin is worsethan both PowMed and PowMax.*
factors on mean response time, we develop a queueing theo-
tion accordingly. The theorems in this paper already tell us
retic model (See Section 4.1) that allows us to predict mean
the optimal power allocation for any given arrival rate. We
response time as a function of the above factors. We then
are now working on incorporating the effects of switching
produce theorems (See Sections 4.2 and 4.3) that determine
costs into our model.

the optimal power allocation for every possible configuration
Second, while we have considered turning servers on or off,
of the above factors.

today's technology [2, 6] allows for servers which are

*sleep-*
To verify our theoretical predictions, we conduct extensive

*ing *(HALT state or deep C states). These sleeping servers
experiments on an IBM BladeCenter for a range of work-
consume less power than servers that are on, and can more
loads using DFS, DVFS, and DVFS+DFS (See Section 5
quickly be moved into the on state than servers that are
and 6). In every case, we find that the experimental results
turned off. We are looking at ways to extend our theorems
are in excellent agreement with our theoretical predictions.

to allow for servers with sleep states.

Third, while this paper deals with power management
at the server level (measuring and allocating power to the
8. DISCUSSION AND FUTURE WORK
server as a whole), our techniques can be extended to deal
There are many extensions to this work that we are ex-
with individual subsystems within a server, such as power
ploring, but are beyond the scope of this paper.

allocation within the storage subsystem. We are looking
First of all, the arrival rate into our server farm may vary
at extending our implementation to individual components
dynamically over time. In order to adjust to a dynamically
within a server.

varying arrival rate, we may need to adjust the power alloca-
[19] David Mosberger and Tai Jin. httperf—A Tool for
[1] Lesswatts.org: Race to idle.

Measuring Web Server Performance.

*ACM Sigmetrics:*
*Performance Evaluation Review*, 26:31–37, 1998.

[20] Vivek Pandey, W. Jiang, Y. Zhou, and R. Bianchini.

[2] Intel: Nehalem.

DMA-Aware Memory Energy Management.

*HPCA*
*'06: The 12th International Symposium on*
NGMS001/SF08 NGMS001 100t.pdf.

*High-Performance Computer Architecture*, pages
[3] U.S. Environmental Protection Agency. Epa report on
133–144, 11-15 Feb. 2006.

server and data center energy efficiency. 2007.

[21] Ramya Raghavendra, Parthasarathy Ranganathan,
[4] National Electrical Contractors Association. Data
Vanish Talwar, Zhikui Wang, and Xiaoyun Zhu. No
centers - meeting today's demand. 2007.

"Power" Struggles: Coordinated Multi-Level Power
[5] Jeffrey S. Chase, Darrell C. Anderson, Prachi N.

Management for the Data Center. In

*ASPLOS XIII:*
Thakar, and Amin M. Vahdat. Managing energy and

*Proceedings of the 13th international conference on*
server resources in hosting centers. In

*In Proceedings*
*Architectural support for programming languages and*
*of the Eighteenth ACM Symposium on Operating*
*operating systems*, pages 48–59, 2008.

*Systems Principles (SOSP)*, pages 103–116, 2001.

[22] K. Rajamani, H. Hanson, J. C. Rubio, S. Ghiasi, and
[6] Intel Corp. Intel Core2 Duo Mobile Processor
F. L. Rawson. Online power and performance
Datasheet: Table 20.

estimation for dynamic power management.

*Research*
*Report RC-24007*, July 2006.

32012001.pdf, 2008.

[23] Salvatore Sanfilippo. WBox HTTP testing tool
[7] M. Elnozahy, M. Kistler, and R. Rajamony. Energy
(Version 4). http://www.hping.org/wbox/, 2007.

conservation policies for web servers. In

*USITS*, 2003.

[24] X Wang and M Chen. Cluster-level Feedback Power
[8] Xiaobo Fan, Wolf-Dietrich Weber, and Luiz Andre
Control for Performance Optimization.

*14th IEEE*
Barroso. Power provisioning for a warehouse-sized

*International Symposium on High-Performance*
computer. pages 13–23, 2007.

*Computer Architecture (HPCA 2008)*, February 2008.

[9] Wes Felter, Karthick Rajamani, Tom Keller, and
[25] Zhikui Wang, Xiaoyun Zhu, Cliff McCarthy, Partha
Cosmin Rusu. A performance-conserving approach for
Ranganathan, and Vanish Talwar. Feedback Control
reducing peak power consumption in server systems.

Algorithms for Power Management of Servers. In
In

*ICS '05: Proceedings of the 19th annual*
*Third International Workshop on Feedback Control*
*International Conference on Supercomputing*, pages

*Implementation and Design in Computing Systems*
293–302, New York, NY, USA, 2005. ACM.

*and Networks (FeBid)*, Annapolis, MD, June 2008.

[10] Mark E. Femal and Vincent W. Freeh. Boosting Data
Center Performance Through Non-Uniform Power
Allocation. In

*ICAC '05: Proceedings of the SecondInternational Conference on Automatic Computing*,
A. PROOFS OF OPEN LOOP CONFIGURA-
pages 250–261, Washington, DC, 2005.

TION THEOREMS (SECTION 4.2)
[11] M. S. Floyd, S. Ghiasi, T. W. Keller, K. Rajamani,
In this appendix we provide brief sketches of the proofs of
F. L. Rawson, J. C. Rubio, and M. S. Ware. System
each theorem from the paper. All theorems, again, assume
Power Management Support in the IBM POWER6
Microprocessor.

*IBM Journal of Research andDevelopment*, 51:733–746, 2007.

Theorem 1.

*Given an open k-server farm configuration*
[12] Anshul Gandhi, Mor Harchol-Balter, Rajarshi Das,

*with a linear power-to-frequency relationship and power bud-*
and Charles Lefurgy. Optimal power allocation in

*get P, the following power allocation setting minimizes E*[

*T *]

*:*
server farms.

*Technical Report CMU-CS-09-113*, 2009.

*If sb ≤ α:*
1

*,*2

*,.,n *=

*c, P ∗*
*n*+1

*,n*+2

*,.,k *= 0
[13] Intel Corp. Intel Math Kernel Library 10.0 -

*If sb > α:*
1

*,*2

*,.,n *=

*c, P ∗*
*n*+1

*,n*+2

*,.,k *= 0

*if λ ≤ λlow*
*P∗*1

*,*2

*,.,m *=

*b, P∗m*+1

*,m*+2

*,.,k *= 0

*if λ > λlow*
*where λlow *=

*α · P.*
Proof. We provide a complete proof for the case of

*k *= 2
[14] Raj Jain.

*The Art of Computer Systems Performance*
in Lemma 1. We then use this result to prove the theorem

*Analysis: techniques for experimental design,*
for the case of arbitrary

*k *by contradiction.

*measurement, simulation, and modeling*. pages563–567. Wiley, 1991.

Lemma 1.

*Given an open 2-server farm configuration with*
[15] Radim Kolar. Web bench.

*linear power-to-speed relationship and power budget P, the*
*following power allocation minimizes E[T]:*
[16] Kleinrock L.

*Queueing Systems, Volume 2*.

*If sb ≤ α:*
1 = min

*{c, P } , P*2 =

*P − P ∗*
Wiley-Interscience, New York, 1976.

*P∗*1 = min

*{c, P} , P∗*2 =

*P − P∗*1

*if λ ≤ λlow*
[17] Charles Lefurgy, Xiaorui Wang, and Malcolm Ware.

*If sb > α:*
1 =

*b, P ∗*
2 =

*P − b if *2

*b ≤ P ≤ b *+

*c *&

*λ > λlow*
Power capping: a prelude to power shifting.

*Cluster*
*P∗*1 = min

*{c, P} P∗*2 = min

*{c, P − P∗*1

*} otherwise*
*Computing*, November 2007.

*where λlow *=

*sc*+

*α *(

*c − b*)(

*c *+

*b − P*)

*− sb*(

*sb *+

*α*(

*P − *2

*b*))

*.*
[18] J.D. McCalpin. Stream: Sustainable memory
bandwidth in high performance computers.

http://www.cs.virginia.edu/stream/.

Proof. [Lemma 1] The proof is trivial for the cases

*P <*
2

*b *and

*P ≥ *2

*c*. Thus, assume 2

*b ≤ P < *2

*c*. Let

*P *be split
two servers, say servers

*i *and

*j*, such that

*b ≤ Pi < c *and
among the two servers as (

*P*1

*, P*2).

*b ≤ Pj < c*. We invoke Lemma 1 on servers

*i *and

*j *with
The goal is to find the optimal power allocation that min-
total power

*P0 *=

*Pi *+

*Pj*. Lemma 1 tells us that the optimal
imizes

*E*[

*T *] given the constraint:
power allocation for servers

*i *and

*j *is

*P∗i *= min

*{c, P0} *and

*j *=

*P 0 − P ∗*
*i *. It can also be shown (see [12]) that reducing
the mean response time of the 2-server system consisting
Assuming a linear power-to-frequency relationship, we have
of servers

*i *and

*j *reduces the mean response time of the
the following relationship for the speed of server

*i*,

*s*
whole system. Thus, we have shown that the mean response
function of the power allocated to it,

*P*
time of the system under

*π *can reduced. Thus, PowMax is

*si *=

*sb *+

*α*(

*Pi − b*),
The case of

*sb > α *is more complex. We want to show
where

*α*,

*s*
that the servers that are turned on should either all run at

*b *and

*b *are as defined in Sections 2.3 and 3.

We now wish to derive

*E*[

*T *] for a 2-server farm with speeds
power

*c *or all run at power

*b*, depending on

*λ*. We start
with an arbitrary power allocation and show that we can
1 and

*s*2. Recall from Section 4.1 that we have a PS schedul-
ing discipline and a Poisson arrival process with some mean
repeatedly apply Lemma 1 to two servers at a time to end

*λ*. It is well-known [16] that for a single M/G/1/PS queue
up in a power allocation where we have some servers at
with speed

*s*, Poisson arrival rate

*λ *and general job size,
power

*b*, some servers at power

*c *and at most one server at

*E*[

*T *] is as follows:
some intermediate power. At this point, we optimize overall power allocation settings of the above type, and find that

*E*[

*T *]

*M/G/*1

*/P S*
*E*[

*T *] is minimized in the particular power allocation where
either

*n *servers are at power

*c *(for low

*λ*) or

*m *servers are
By exploiting Poisson splitting in our model, we have that
at power

*b *(for high

*λ*). For details, see [12].

the mean response time of jobs in the server farm with split-
Theorem 2.

*Given an open k-server farm configuration*
ting parameter

*q *(where

*q *is the fraction of jobs sent to the

*with a cubic power-to-frequency relationship (as in Eq. (2))*
1st server) and server speeds

*s*1 and

*s*2 is:

*and power budget P, the following power allocation setting*
*minimizes E*[

*T *]

*:*
*E*[

*T *] =

*s*2

*− λ · *(1

*− q*)
1

*,*2

*,.,n *=

*c,*
*P∗n*+1

*,n*+2

*,.,k *= 0

*if λ ≤ λ0low .*
given the stability constraints:
1

*,*2

*,.,l *=

*P*
*l*+1

*,l*+2

*,.,k *= 0

*if λ > λ0low*
*c − b − *3

*− b and l *=
1

*> λq*
*s*2

*> λ*(1

*− q*)

*low *=

*nlα0*
*b*+

*α0P*
From Eq. (6) we see that

*E*[

*T *] is a function of

*s*1,

*s*2 and

*q*.

However, using Eqs. (4) and (5), we can express

*s*
Proof. We follow the same process as in the proof of
Thm. 1. Staring with the case of two servers, we note that
Eq. (9) is still valid in the case of a cubic power-to-frequency

*s*2 = 2

*sb *+

*α*(

*P − *2

*b*)

*− s*1
relationship. However, Eq. (8) now takes the form:
We now derive the optimal value of

*q*, which we call

*q∗*. To

*s*2 =

*sb *+ 3

*α0*3(

*P − *2

*b*)

*− *(

*s*1

*− sb*)3
do this, we fix the speeds

*s*1 and

*s*2 in Eq. (6) and set thederivative of

*E*[

*T *] w.r.t.

*q *to be 0. This yields:
where

*α0*,

*sb *and

*b *are as defined in Section 2.3. We can
now express

*E*[

*T *] (from Eq. (10)) in terms of just

*s*1 by
substituting for

*q *and

*s*2 from Eqs. (9) and (11). Hence

*λ*(

*s*1 +

*s*2)
we can differentiate

*E*[

*T *] w.r.t.

*s*1 to yield the optimal

*s*1,
Using

*q∗ *in the expression for

*E*[

*T *] from Eq. (6) gives us:
which translates to

*P∗*1. We find that, for the 2-server case,
PowMax is optimal at low arrival rates, whereas PowMed

*λ *+ 2

*s*
is optimal at high arrival rates. Note that PowMed for two

*E*[

*T *] =
servers refers to

*P∗*
if

*P ≥ *2

*b *and
1 +

*s*2

*− λ*)
=

*P∗*2 = min

*P∗*1 =

*P, P∗*2 = 0 otherwise.

At this point, we can express

*E*[

*T *] in terms of just

*s*1 by
We now return to the proof of Thm. 2. When there are
substituting for

*s*2 from Eq. (8). Hence we can differentiate

*k > *2 servers, we want to show that we should either run

*n*
*E*[

*T *] w.r.t.

*s*1 to yield the optimal

*s*1, which translates via
servers at power

*c *or run

*l *servers at power

*P *, depending
Eq. (5) to

*P∗*
1 . We find that for

*sb ≤ α*, PowMax minimizes
on

*λ*. We start with an arbitrary power allocation and show

*E*[

*T *], where PowMax for 2 servers refers to

*P∗*1 = min

*{c, P}*
that we can repeatedly apply the two server result discussed
and

*P∗*2 =

*P − P*1. For

*sb > α*, we find that PowMax
above to pairs of servers at a time to end up in a power
minimizes

*E*[

*T *] for

*λ ≤ λlow*, whereas PowMin minimizes
allocation where we have some

*v *servers at power

*c *and some

*E*[

*T *] for

*λ > λlow*. PowMin for 2 servers refers to

*P∗*1 =

*w *servers at power

*P−vc *. At this point, we optimize over
2 =

*P − b *if 2

*b ≤ P ≤ b *+

*c *and

*P ∗*
1 = min

*{c, P } , P ∗*
all power allocation settings of the above type, and find that
min

*{c, P − P∗*1

*} *otherwise.

*E*[

*T *] is minimized in the particular power allocation whereeither

*n *servers are at power

*c *(for low

*λ*) or

*l *servers are at
We now return to the proof of Thm. 1. We first consider
power

*P *(for high

*λ*). For details, see [12].

the case

*sb ≤ α*. The proof proceeds by contradiction. We
claim that PowMax is optimal. Thus, assume we have apower allocation

*π *= (

*P*
B. PROOFS OF CLOSED LOOP CONFIG-
1

*, . . , Pk*), which is not the same as
PowMax. Since

*π *is not PowMax, there must exist at least
URATION THEOREMS (SECTION 4.3)
where

*α*,

*sb *and

*b *are as defined in Section 2.3. Thus we
For closed loop configurations, [14] provides well-known
asymptotic bounds for the mean response time both in the
case where N, the number of jobs in the system, is very high
(

*sb *+

*α*(

*Pi − b*))
and in the case where N is very low. These bounds provide
excellent approximations for

*E*[

*T *] . Here, N is considered

*high *if it significantly exceeds the number of servers that are
=

*j*(

*sb − αb*) +

*α*
on, and N is considered

*low *if it is close to 1. We will use
these approximations from [14] to derive the optimal power
=

*j*(

*s*
allocations in Theorem 3 (low N) and in Theorems 4 and 5

*b − αb*) +

*αP*
Hence, if

*sb ≥ α*,

*E*[

*T *] is minimized by maximizing

*j*. Thus
PowMin is optimal for

*sb ≥ α*. When

*sb < α*,

*E*[

*T *] is
Theorem 3.

*Given a closed k-server farm configuration*
minimized by minimizing

*j*. Thus PowMax is optimal for

*with a linear or cubic power-to-frequency relationship, the*
*sb < α*.

*following power allocation minimizes E*[

*T *]

*for low N , based*
*on the asymptotic approximations in [14]:*
Theorem 5.

*Given a closed k-server farm configuration*
*with a cubic power-to-frequency relationship, the following*
*P∗*1

*,*2

*,.,n *=

*c,*
*P∗n*+1

*,n*+2

*,.,k *= 0

*power allocation minimizes E*[

*T *]

*for high N , based on theasymptotic approximations in [14]:*
Proof. We start by assuming the number of jobs in the

*If sb < s0:*
system,

*N *, is low. From [14] we have that:
1

*,*2

*,.,n *=

*c,*
*P∗n*+1

*,n*+2

*,.,k *= 0

*If sb ≥ s0:*
*P∗*1

*,*2

*,.,l *=

*b *+

*x,*
*P∗l*+1

*,l*+2

*,.,k *= 0

*, s0 *=

*msc − α0 *3

*x and x is the non-*
*E*[

*T *]

*≈*
*negative solution of the equation b *= 2

*x *+ 1

*α0 *(3

*x*
Without loss of generality, assume

*s*1

*≥ s*2

*≥ . . ≥ sk*.

Proof. As in the proof for Theorem 4, minimizing

*E*[

*T *]
This implies 1

*≤ *1

*≤ . . ≤ *1 . Thus,

*E*[

*T *] given by
is equivalent to maximizing

*i*, where

*j ≤ k *is the
Eq. (12) is minimized by setting

*q*1 = 1 if

*s*1

*> s*2, or

*q*1 =
number of servers that are turned on. For a cubic power-to-

*. . *=

*qa *= 1 if

*s*
frequency relationship given by Eq. (2), we have:
1 =

*. . *=

*sa *for some integer

*a*. If

*s*1

*> s*2,
we want to minimize

*E*[

*T *] = 1 . Thus PowMax is optimal
in this case. For the case of

*s*
*i *=

*sb *+

*α0 *3
1 =

*. . *=

*sa *for some integer

*a*, we want to minimize

*E*[

*T *] = 1 , for

*i *= 1

*, *2

*, . . , a*. This
where

*α0*,

*s*
*b *and

*b *are as defined in Section 2.3. Thus we
is achieved by maximizing

*s*
*i*. Clearly, by setting

*a *=

*P *,

*s*
is maximized by

*P∗i *=

*c *for

*i *= 1

*, *2

*, . . , a*. Hence, for low

*N *, PowMax is optimal.

*sb *+

*α0 *3

*Pi − b*
Theorem 4.

*Given a closed k-server farm configuration*
*with a linear power-to-frequency relationship , the following*
*b *+

*α0*
*power allocation minimizes E*[

*T *]

*for high N , based on the*
*asymptotic approximations in [14]:*
*If sb < α:*
Since the sum of

*Pi*'s is a constant, the sum of their cube
1

*,*2

*,.,n *=

*c,*
*P∗n*+1

*,n*+2

*,.,k *= 0

*If sb ≥ α:*
roots attains its maximum value when all the

*Pi*'s are equal.

1

*,*2

*,.,m *=

*b,*
*P∗m*+1

*,m*+2

*,.,k *= 0
This follows from the generalized mean inequality. Thus,
Proof. Assuming that N is high, from [14] we have that:
we wish to maximize

*j sb *+

*α0 *3

*− b *. By looking at

*E*[

*T *]

*≈ N · max*
1

*, *2

*, . . , k*
the derivative of

*si*, we find the power allocation that
maximizes the sum of speeds, and thus minimizes the mean
Without loss of generality, assume

*q*1

*≥ q*2

*≥ . . ≥ qj *,
response time. We find that PowMax is optimal for low
values of

*s*
where

*j ≤ k *denotes the number of servers that are turned

*b *whereas PowMed is optimal for high values of
on. Thus,

*E*[

*T *] =

*N · q*1 from Eq. (13). Since

*N *is a

*b*. Hence the result.

constant, minimizing

*E*[

*T *] is now equivalent to minimizing

*q*1 , which attains the lowest value of

*q*2 when

*q*1 =

*q*2 .

However

*q*2 itself attains the lowest value of

*q*3 when

*q*2 =

*q*3 and so on. Thus, to minimize

*E*[

*T *], we must set

*q*1 =

*q*2 =

*. . *=

*qj *=

*r*, say. Since

*i *= 1, we have:

*E*[

*T *] =

*N · *1 =

*N · r *= P
Thus, minimizing

*E*[

*T *] is equivalent to maximizing
For a linear power-to-frequency relationship given by Eq. (5),we have:

*si *=

*sb *+

*α*(

*Pi − b*)

*,*
Source: http://www.cs.zju.edu.cn/people/yedeshi/seminar09/Sigmetrics09power.pdf

Cell Research (2012) 22:107-126. © 2012 IBCB, SIBS, CAS All rights reserved 1001-0602/12 $ 32.00 Exogenous plant MIR168a specifically targets mammalian LDLRAP1: evidence of cross-kingdom regulation by Lin Zhang1, *, Dongxia Hou1, *, Xi Chen1, *, Donghai Li1, *, Lingyun Zhu1, 2, Yujing Zhang1, Jing Li1, Zhen Bian1, Xiangying Liang1, Xing Cai1, Yuan Yin1, Cheng Wang1, Tianfu Zhang1, Dihan Zhu1, Dianmu Zhang1, Jie Xu1,

Physica A 352 (2005) 113–130 A cell-centered approach to developmental biology Roeland M.H. Merks, James A. Glazier Department of Physics, Biocomplexity Institute, Indiana University, Swain Hall West 159, 727 East 3rd Street, Bloomington, IN 47405-7105, USA Available online 13 January 2005 Explaining embryonic development of multicellular organisms requires insight into complex