Objective
Let us practice a little with confidence intervals (CI) and
uncertainty estimates.
We are going to deal with system usage, a critical
concept for sizers. For example, it is the main magnitude used to evaluate the
base point and propose the right upgrade to the current server. There are many
ways to define and measure system usage, but for the purpose of this blog
entry the particular choice doesn’t matter. I’ll refer to the chosen magnitude generically as
the peak usage (PU).
ABC of the Confidence Interval (CI)
You should review the post “The importance of what it is not
said” in this blog for detailed explanations, but here are the basic facts:
The CI expresses the uncertainty, or probable error, you incur
when estimating the value of a certain magnitude (PU in our case). It is reported in
the following format
CI Center ± CI Length / 2
where
- CI Center = AVERAGE(data), the arithmetic average of the data (the measured values)
- CI Length = 2 * T.INV.2T(1-conf, size-1) * STDEV.S(data) / SQRT(size)
and
- data is the sample, or set of measured data points,
- size is the sample size, the number of data points,
- conf is the desired confidence level (CL),
- T.INV.2T(), STDEV.S() and SQRT() are worksheet (Excel) functions (1).
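As a rough illustration, the worksheet formula above can be mirrored in Python. This is only a sketch under stated assumptions: the helper names (`t_cdf`, `t_inv_2t`, `confidence_interval`) are invented for this example, the Student t quantile is approximated numerically with the standard library instead of calling Excel, and the three data values are merely the illustrative daily peaks quoted later in this entry (68%, 72%, 62%), not a real measurement series.

```python
import math
import statistics

def t_cdf(t, df, steps=200):
    """Student t CDF for t >= 0: Simpson's rule after substituting t = sqrt(df)*tan(u)."""
    theta = math.atan(t / math.sqrt(df))
    k = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    h = theta / steps
    total = 1.0 + math.cos(theta) ** (df - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (df - 1)
    return 0.5 + k * total * h / 3

def t_inv_2t(p, df):
    """t such that P(|T| > t) = p; plays the role of Excel's T.INV.2T(p, df)."""
    target = 1 - p / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains t
        hi *= 2
    for _ in range(50):            # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def confidence_interval(data, conf=0.90):
    """Return (CI center, full CI length) following the worksheet formula."""
    size = len(data)
    center = statistics.mean(data)  # AVERAGE(data)
    length = 2 * t_inv_2t(1 - conf, size - 1) * statistics.stdev(data) / math.sqrt(size)
    return center, length

# Illustrative values only (daily peak usage, in %).
center, length = confidence_interval([68, 72, 62], conf=0.90)
print(f"{center:.1f} +/- {length / 2:.1f} (CL=90%)")
```

Note that `statistics.stdev` is the sample standard deviation, the counterpart of STDEV.S(), and that the figure printed after the +/- sign is half of the CI length.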
The CI length (the
uncertainty or imprecision) depends on, and varies with, the following factors:
- Decreases with the sample size,
- Increases with the sample variability/dispersion,
- Increases with the confidence level.
Analysis of Samples
Scenario: a typical OLTP service in which the peak hour is
always from 11 a.m. to noon. Due to the inherent randomness of the system (in customer
arrivals, in service times) the height of the peak, the PU,
varies from one day to another. One day the maximum is 68%, another day 72%,
another 62%, and so on. You have compiled a series of 20 days of daily PU, as
shown in figure 1. Of course, you have
conveniently reviewed the points to be sure, beyond any reasonable doubt, of
their homogeneity and representativeness (i.e. they are values for normal workdays,
and holidays have been removed).
Fig. 1: Daily peak
usage for the low variability (LoVar) case. Sample size=20, sample average=70.5%,
sample standard deviation=3.7%.
The CI for the average peak usage is 70.5 ±1.5% for
a confidence level of 90%. This essentially means that there is a mere 5%
probability that the all-the-days peak usage average (2) lies above 72%,
the CI upper limit (UL).
Of course we are speaking statistically: it is possible that
the values in figure 1 happen to be abnormally low, as there is a 5% probability
of this happening. If you want to reduce this probability, increase the
confidence level; the CI length will increase as well. With 98% CL the
probability goes down to 1% (on average, one out of every 100 20-day samples!).
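The link between the confidence level and the probability of overshooting the upper limit is simple arithmetic: a two-sided CI leaves (1 - CL) outside the interval, split evenly between both tails. A minimal sketch, where the function name `upper_tail_probability` is just a label chosen for this example:

```python
def upper_tail_probability(conf):
    """Probability that the true average lies above the upper limit of a two-sided CI."""
    return (1 - conf) / 2

for conf in (0.90, 0.98):
    print(f"CL={conf:.0%}: {upper_tail_probability(conf):.1%} above the upper limit")
```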
More often than not we don’t have the luxury of large
samples. For example, you’ve been asked to size some system and the sysadmin
does not record the PU or, at least, not in the way you want. After giving
directions on what you need and how to measure it, you get 3 days of PU.
Suppose that these three days are the latest in the series (days 18, 19 and
20). For these three days the CI for the average peak usage is 73.0 ±9.0%
with CL=90%. Hence there is a 5% probability that the all-the-days peak
usage average lies above 82.0%. Clearly the CI length has increased and its upper
limit is higher. Reason: with only three data points our knowledge of the PU
behavior is very poor and the safety margin grows.
Summarizing:
- LoVar system PU. 20 days. CL=90%. CI: 70.5 ±1.5%. UL: 72%.
- LoVar system PU. 3 days. CL=90%. CI: 73.0 ±9.0%. UL: 82%.
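Much of the jump from 20 days to 3 days comes from the factor T.INV.2T(1-conf, size-1)/SQRT(size) that multiplies the standard deviation. The sketch below evaluates that factor for both sample sizes. It is an approximation under stated assumptions: the helper names (`t_cdf`, `t_inv_2t`, `ci_multiplier`) are invented for this example and the Student t quantile is computed numerically with the standard library rather than with Excel. The only data used is the 3.7% standard deviation from the caption of figure 1.

```python
import math

def t_cdf(t, df, steps=200):
    """Student t CDF for t >= 0: Simpson's rule after substituting t = sqrt(df)*tan(u)."""
    theta = math.atan(t / math.sqrt(df))
    k = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    h = theta / steps
    total = 1.0 + math.cos(theta) ** (df - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (df - 1)
    return 0.5 + k * total * h / 3

def t_inv_2t(p, df):
    """t such that P(|T| > t) = p; plays the role of Excel's T.INV.2T(p, df)."""
    target = 1 - p / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains t
        hi *= 2
    for _ in range(50):            # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ci_multiplier(size, conf=0.90):
    """Factor that multiplies the sample SD to give half the CI length."""
    return t_inv_2t(1 - conf, size - 1) / math.sqrt(size)

for size in (20, 3):
    print(f"size={size:2d}: half-length = {ci_multiplier(size):.2f} x SD")

# LoVar, 20 days, SD = 3.7% (figure 1 caption).
print(f"LoVar, 20 days: +/- {ci_multiplier(20) * 3.7:.2f}%")
```

With the same standard deviation, the 3-day factor is more than four times the 20-day one. The 20-day result should land close to the ±1.5% quoted above; any small gap presumably comes from rounding in the reported figures.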
Suppose you have the PU daily measurements for another, more
erratic system, called HiVar. See figure 2.
Fig. 2: Peak usage
for the high variability (HiVar) case. Sample size=20, sample average=68.0%,
sample standard deviation=10.5%.
The CI for the average peak usage is 68.0 ±4.0% for
a CL of 90%. And if we take only the last three days, as in the previous
case, the CI for the average peak usage is 68.0 ±18.0% for a CL of
90%.
Summarizing:
- HiVar system PU. 20 days. CL=90%. CI: 68.0 ±4.0%. UL: 72%.
- HiVar system PU. 3 days. CL=90%. CI: 68.0 ±18.0%. UL: 86%.
Recommended Sample Size
Based on the CI framework we can recommend what the sample
size (the number of measured days) should be if you want to keep the
uncertainty or imprecision below some threshold. For the particular and
reasonable case of a CI length under 10% for CL=90% we find
- If the data values show low variability (sample standard deviation around 5%) you must measure at least 5 days.
- If the data values show medium variability (sample standard deviation around 10%) you must measure at least 13 days.
- If the data values show high variability (sample standard deviation around 20%) you must measure at least 46 days! In such a case we should relax the precision objective and allow for a wider CI.
We can extend and summarize the above results in the
following table:
| CI length (%) | Low variability | Medium variability | High variability |
| --- | --- | --- | --- |
| 5 | 13 | 46 | 176 |
| 10 | 5 | 13 | 46 |
| 20 | 3 | 5 | 13 |

Table 1: Minimum sample size (days) as a function of the CI length and the sample variability (Low is
SD=5%, Medium is SD=10% and High is SD=20%), for CL=90%. SD stands for standard
deviation.
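The minimum sample sizes can be approximated by increasing the size until the CI length formula drops to the target. The following is a sketch under stated assumptions, not the procedure actually used to build the table: the helper names (`t_cdf`, `t_inv_2t`, `ci_length`, `min_sample_size`) are invented here, the Student t quantile is computed numerically with the standard library instead of Excel's T.INV.2T, and the search limit of 1000 days is an arbitrary cap.

```python
import math

def t_cdf(t, df, steps=200):
    """Student t CDF for t >= 0: Simpson's rule after substituting t = sqrt(df)*tan(u)."""
    theta = math.atan(t / math.sqrt(df))
    k = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    h = theta / steps
    total = 1.0 + math.cos(theta) ** (df - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (df - 1)
    return 0.5 + k * total * h / 3

def t_inv_2t(p, df):
    """t such that P(|T| > t) = p; plays the role of Excel's T.INV.2T(p, df)."""
    target = 1 - p / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains t
        hi *= 2
    for _ in range(50):            # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ci_length(sd, size, conf=0.90):
    """Full CI length for a sample of `size` points with standard deviation `sd`."""
    return 2 * t_inv_2t(1 - conf, size - 1) * sd / math.sqrt(size)

def min_sample_size(sd, max_length, conf=0.90, limit=1000):
    """Smallest size whose CI length does not exceed max_length (None if above limit)."""
    for size in range(2, limit + 1):
        if ci_length(sd, size, conf) <= max_length:
            return size
    return None

# Row "CI length = 10%" for SD = 5%, 10% and 20%.
print([min_sample_size(sd, 10) for sd in (5, 10, 20)])
```

The linear search relies on the CI length shrinking steadily as the sample grows, which holds here because both the t quantile and 1/SQRT(size) decrease with the size.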
Or, expressed the other way around:

| Sample size (days) | Low variability | Medium variability | High variability |
| --- | --- | --- | --- |
| 5 | 10 | 19 | 38 |
| 10 | 6 | 12 | 23 |
| 20 | 4 | 8 | 15 |

Table 2: CI length (%) as
a function of the sample size and the sample variability (Low is SD=5%, Medium
is SD=10% and High is SD=20%), for CL=90%.
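The entries of this second table follow directly from the CI length formula evaluated at each sample size and standard deviation, rounded to the nearest integer. A sketch under stated assumptions: the helper names (`t_cdf`, `t_inv_2t`, `ci_length`) are invented for this example, and the Student t quantile is approximated numerically with the standard library rather than taken from Excel.

```python
import math

def t_cdf(t, df, steps=200):
    """Student t CDF for t >= 0: Simpson's rule after substituting t = sqrt(df)*tan(u)."""
    theta = math.atan(t / math.sqrt(df))
    k = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    h = theta / steps
    total = 1.0 + math.cos(theta) ** (df - 1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (df - 1)
    return 0.5 + k * total * h / 3

def t_inv_2t(p, df):
    """t such that P(|T| > t) = p; plays the role of Excel's T.INV.2T(p, df)."""
    target = 1 - p / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:  # widen the bracket until it contains t
        hi *= 2
    for _ in range(50):            # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ci_length(sd, size, conf=0.90):
    """Full CI length for a sample of `size` points with standard deviation `sd`."""
    return 2 * t_inv_2t(1 - conf, size - 1) * sd / math.sqrt(size)

# One row per sample size, one column per SD (5%, 10%, 20%), CL=90%.
table = [[round(ci_length(sd, size)) for sd in (5, 10, 20)] for size in (5, 10, 20)]
for size, row in zip((5, 10, 20), table):
    print(size, row)
```

For a fixed sample size the CI length is proportional to the standard deviation, which is why each row roughly doubles from one column to the next.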
Final Words
- You must have a sufficient number of data points. In the sizing world 5 data points should be the absolute minimum.
- If you don’t have the luxury of large samples, refrain from making bold predictions and use the CI framework to evaluate plausible expectations.
Notes
(1) T.INV.2T()
is the two-tailed inverse of the Student t distribution, STDEV.S() the sample standard deviation, and
SQRT() the square root.
(2) The
value we would have obtained if we had used all the days, instead of a sample
(subset) of them.