After reading "Better 1x10 than 10x1" you're convinced you need a big server. So you go to a big enterprise computer shop and ask for one system with 10 units of processing capacity. The seller shows you two models: one called "10x1", equipped with 10 processors of relative speed 1, and another called "1x10", equipped with only 1 processor that is 10 times faster. Both systems carry a certificate stating that they have 10 units of capacity, and both cost the same. Which one would you choose?
First of all, you must be aware that the most common measure of the capacity of any server, and in general of any service center, is the maximum number of units of work (UOW) per unit of time (UOT) it can service (Footnote 1). This is the maximum throughput it supports, also called the bandwidth. In our particular case, both servers have the same value: 10 UOW/UOT. So this is not going to help us choose a server. Knowing the maximum throughput is necessary, but not sufficient.
To find the
answer let’s switch to another point of view: we move away from the overall
behavior, represented by the benchmarked maximum throughput, to the individual
perception of performance, represented by the response time. Does a particular
user experience the “same” performance in both servers?
Suppose that 1 customer (requesting 1 UOW) arrives at an empty (100% free) server. In that case the customer sees no queue ahead, does not have to wait, and receives immediate service. In the 10x1-server case it spends, let's say, 10 units of time receiving service (this time is called the service time, S). In the 1x10-server case it spends 1 unit of time, because the processor is 10 times faster.
This makes a difference, doesn't it? In the low load zone, where queuing seldom occurs, the 1x10-server shows an obvious advantage over the 10x1-server, due to the higher speed of its processor.
In the high load zone, there is a stream of users (UOWs) arriving and long queues build up. The user has to wait patiently for the queue to advance until it is his turn. So a user request spends much more time waiting to be serviced than being serviced. The time spent in the queue, waiting to be serviced, is called the waiting time (W). This waiting time, being much greater than the service time, dominates the user experience.
The waiting time is essentially the same for the two kinds of servers. If you arrive at a waiting line and see 1000 customers ahead of you, you'll wait the same whether the queue advances 10 customers every 10 units of time, as is the case for the 10x1-server, or 1 customer every 1 unit of time, as is the case for the 1x10-server. This means that in the high load zone both types of servers deliver essentially the same individual user performance.
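
The 1000-customer example can be checked with a few lines of arithmetic. This is only a sketch under simplifying assumptions that the text does not state: service times are fixed (10 units on each slow processor, 1 unit on the fast one), customers are served in arrival order, and every processor starts a new customer at time 0. The function name `wait_before_service` is made up for this illustration.

```python
def wait_before_service(ahead: int, processors: int, service_time: float) -> float:
    """Time a new arrival waits when `ahead` customers are in line before it.

    Assumes fixed service times, first-come-first-served order, and that all
    processors start serving the head of the line at time 0.
    """
    # Customers enter service in batches of `processors`; each batch keeps
    # the processors busy for exactly one service time.
    full_batches = ahead // processors
    return full_batches * service_time


AHEAD = 1000

# (name, number of processors, service time per customer)
for name, processors, service_time in [("10x1", 10, 10.0), ("1x10", 1, 1.0)]:
    wait = wait_before_service(AHEAD, processors, service_time)
    response = wait + service_time  # response time = waiting time + service time
    print(f"{name}: wait={wait:.0f}  response={response:.0f}")
```

Under these assumptions both servers make you wait 1000 units of time. The total response time is 1010 units on the 10 slow processors and 1001 on the single fast one, a difference of less than 1%.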
Summarizing: in the low load zone the 1x10-server is the winner, and moving into the high load zone the 1x10-server advantage gets blurred. In the end, when the very high load zone is reached, both servers perform equally.
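
To put rough numbers on this summary, here is a minimal sketch based on a standard queueing model. The model is an assumption of this example, not something the text specifies: arrivals are random (Poisson), service times are exponentially distributed, and there is a single shared queue, so the ten slow processors behave as an M/M/10 system and the single fast processor as an M/M/1 system. The service times of 10 and 1 units are the illustrative values used earlier; the time scale is arbitrary, and only the ratio between the two response times matters.

```python
from math import factorial


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability that an arrival must queue in an M/M/c system (Erlang C)."""
    c, a = servers, offered_load
    rho = a / c  # utilization, must be below 1
    top = a**c / factorial(c) / (1.0 - rho)
    bottom = sum(a**k / factorial(k) for k in range(c)) + top
    return top / bottom


def mean_response_time(servers: int, service_time: float, utilization: float) -> float:
    """Mean response time R = W + S at a given utilization (0 < utilization < 1)."""
    offered_load = utilization * servers
    # Mean wait in queue for M/M/c: W = C(c, a) * S / (c * (1 - utilization))
    wait = (erlang_c(servers, offered_load) * service_time
            / (servers * (1.0 - utilization)))
    return wait + service_time


for utilization in (0.1, 0.5, 0.9, 0.99, 0.999):
    r_many_slow = mean_response_time(10, 10.0, utilization)  # ten slow processors
    r_one_fast = mean_response_time(1, 1.0, utilization)     # one fast processor
    print(f"utilization={utilization:5.3f}  "
          f"slow/fast response ratio={r_many_slow / r_one_fast:.3f}")
```

Under these assumptions the ten slow processors are about 9 times slower to respond at 10% utilization, and the ratio shrinks toward 1 as utilization approaches 100%.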
You now have a good reason to prefer one kind of server, the 1x10-server, to the other, the 10x1-server: the response time will be better in the low load zone. This may or may not be significant in your particular case of interest. It should be more noticeable in long-running, non-parallelizable tasks than in short-running ones, but the fact is there and you should be aware of it.
Server vendors typically put a premium price on 1x10-servers. I would like to think that this increase in price is because of their better performance, but I suppose that the real reason lies in the higher research and development and production costs of faster processors and components.
A final word here: in the previous "Better 1x10 than 10x1" blog entry we presented a reason to prefer bigger servers: the random nature of user arrivals or demands. And here we've seen that it is better to achieve the big capacity with fewer, faster processors. The pattern "1x10 better than 10x1" repeats, but from a different point of view.
Think about
it.
Footnote:
(1) Units of work (UOW) and units of time (UOT) are generic units. In the computer system world they can express, e.g., transactions per minute or tpm (UOW=transaction, UOT=minute), IOPS (UOW=IO operation, UOT=second), and similar. But the concept is generally applicable to other areas: public service desks (UOW=a certain customer request, UOT=minute), highway tolls (UOW=a car passing and paying, UOT=minute), and so on. The only limit is our imagination.