Demystifying Performance

I've always been attracted to computer performance subjects: throughput, response time, sizing, bottlenecks, usage, capacity. I haven't always found clear, understandable and knowledgeable explanations of these subjects. With the arrival of virtualization even more complexity is introduced: terms that used to represent constant values, like machine capacity, get blurred. What can be said about the capacity of a virtual machine if it is no longer a fixed value and may change from one hour to the next?

Monday, February 15, 2016

How much capacity does a virtual cpu guarantee?

The quick answer to the question “how much capacity does a virtual CPU guarantee?” is: as much as one core can deliver (PowerVM VP), or as much as one thread can deliver (ESX vCPU). This is the best case, and it is the one considered in the two previous entries (“Don’t put in the same bag Xeon and POWER virtual CPUs”, “More on ESX vCPU versus PowerVM VP”). But the best case is not necessarily the most common case.

Two typical non-best cases in the real world:

For a good technical reason: leveraging the “sharing” capability that virtualization technologies enable.
For a bad economic reason: reselling the same underlying physical capacity to more than one customer.

Whatever the true reason, you should be aware of a parameter, assigned to a VM, that helps a lot in specifying how much capacity your particular VP (vCPU) guarantees. It’s called Entitlement in PowerVM, and Reservation in ESX. These parameters have a very interesting property: they cannot be overcommitted, that is, you cannot distribute among the VMs more Entitled (Reserved) capacity than is available in the physical machine. By contrast, you can create and distribute more VPs (vCPUs) than existing cores (threads). If you simply divide the VM Entitlement (Reservation) by the number of VPs (vCPUs) in the VM, you get the capacity that one virtual CPU guarantees.

Let us illustrate this with a very simple scenario, a reasonable setup for two VMs, the Red VM and the Blue VM, with the same importance. This situation may arise, for example, when two production environments share the same physical machine (PM).

Physical Machine	IBM POWER S824 2s/24c/192t POWER8 @3.52GHz
Cores	24
SAPS	115870
SAPS/core	4828

Red VM		Blue VM
12	Entitlement (cores)	12
24	Virtual Processors	24
Uncapped	Cap/Uncap	Uncapped

The reason for 24 VPs per VM, instead of 12, is to let each VM reach 100% of the PM capacity.

In the best case (for the Red VM) the Blue VM remains idle, and under such circumstances the 24 Red VPs can use the 24 cores, giving the equivalence 1 VP = 1 core = 4828 SAPS. To simplify, we are not taking into account capacity reductions due to virtualization.

In the worst case (for the Red VM), the Blue VM is fully loaded and then the 24 Red VPs can use 12 cores at most, the Entitlement, giving the equivalence 1 VP = 0.5 core = 2414 SAPS.

So the actual Red VP capacity will be somewhere in the interval [2414, 4828] SAPS, depending on the Blue VM usage. It will never be less than 2414 SAPS, the guaranteed or worst case. It will never be more than 4828 SAPS, the underlying core capacity.

By the way, this is a good illustration of the good technical reason for overcommitting VPs (vCPUs) mentioned above. But it may also be an example of the bad economic reason: just imagine that the owner of the physical machine sells 24 VPs to the red customer and 24 VPs to the blue customer, promising 1 VP = 4828 SAPS to everyone!
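The Red VP interval can be reproduced with a few lines of Python. This is only a sketch under the same simplification used in the scenario (no virtualization overhead). The function name vp_capacity_range is made up for the illustration, and the SAPS per core value is derived from the physical machine table (115870 SAPS / 24 cores).

```python
def vp_capacity_range(saps_per_core, entitlement_cores, num_vps):
    """Return (guaranteed, best_case) capacity of one VP, in SAPS."""
    # Worst case: the VM only gets its Entitlement, spread over all its VPs.
    guaranteed = saps_per_core * entitlement_cores / num_vps
    # Best case: each VP runs on a whole core (1 VP = 1 core).
    best_case = saps_per_core
    return guaranteed, best_case


saps_per_core = 115870 / 24  # IBM POWER S824, from the table
low, high = vp_capacity_range(saps_per_core, entitlement_cores=12, num_vps=24)
print(f"1 Red VP is in [{low:.0f}, {high:.0f}] SAPS")
```

The same function applies to ESX if the Reservation and the thread capacity are expressed in the same units.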

Summarizing

If you only know the number of VP (vCPU) that your VM has been assigned, you don’t have enough information to establish its precise capacity point. At least you should be informed of its Entitlement (Reservation) to derive the guaranteed capacity.

	PowerVM	ESX
Best Case	1 VP = 1 core	1 vCPU = 1 thread
Guaranteed / Worst Case	1 VP = Entitlement / Number of VPs	1 vCPU = Reservation / Number of vCPUs

Mirror: https://www-304.ibm.com/connections/blogs/performance/entry/how_much_capacity_does_a_virtual_cpu_guarantee?lang=en_us

Monday, January 11, 2016

Don’t put in the same bag Xeon and POWER virtual CPUs

Just a reminder: this blog is a mirror of the "main" site https://www-304.ibm.com/connections/blogs/performance (http://ibm.biz/demystperf)

In the pervasive virtual world the standard unit of performance capacity happens to be the virtual CPU (vCPU). “This virtual machine (VM) has 6 vCPUs”, “you will have to provide 8 vCPUs for that VM”, and the like are common sentences. It could be a reasonable metric of performance if the underlying physical CPUs and the hypervisor layer were all the same. But this is seldom the case: Intel Xeon processors combined with VMware ESX virtualization, and IBM POWER processors combined with POWERVM virtualization are very different beasts.

If you need to size or convert capacity between these dissimilar systems, need a solid basis for comparison, or would like to unmask the tricks and pitfalls that plague sizing in the virtual world, continue reading.

IBM POWERVM

The POWERVM term for vCPU is Virtual Processor (VP). The VM has, or sees, VPs, and these VPs are scheduled, in a time-shared manner, on POWER cores. Yes, read it again: one VP is scheduled on one core. I stress this because in the ESX / Intel world this is different, as you will see later.

Given this VP-to-core mapping, the VP capacity ranges between two values:

In the best case the capacity of one VP is the capacity that one core can deliver, that is, 1 VP is 1 core.
In the worst case 1 VP is one tenth (1/10) of a core.

The actual VP capacity depends on the following factors (revisit “Why is the Virtual Capacity so Important?” and “The Playground of the Virtual Capacity” in this blog for a detailed explanation):

configuration parameters of the VM the VP belongs to (entitlement, capped/uncapped attribute, uncapped weight)
configuration parameters of all the other VMs sharing the same physical machine (PM)
actual usage of capacity from all the other VMs sharing the same physical machine

Given this highly variable value, the opposite of what a unit of measure should be, how have VPs been promoted to a “standard” measure of capacity? Amazing, don’t you think?

VMWARE ESX

The VM has, or sees, vCPUs. And those vCPUs are scheduled on Intel processor threads, in a time-shared manner. The mapping is vCPU-to-thread, which is different from the POWER / POWERVM case (VP-to-core).

Given this vCPU-to-thread mapping, the vCPU capacity ranges between two values:

In the best case the capacity of one vCPU is the capacity that one thread can deliver, that is 1 vCPU is 1 thread
In the worst case one vCPU is very small (I’m not aware of a lower limit)

The actual vCPU capacity depends on the same factors described in the POWERVM case:

configuration parameters of the VM the vCPU belongs to
configuration parameters of all the other VMs sharing the same PM
actual usage of capacity from all the other VMs sharing the same PM.

Benchmarking vCPUs

The reputation of the vCPU as a stable capacity unit of measure has been destroyed. A vCPU capacity can range from a full core (or a full thread) down to a small fraction, and even depends on external factors (the other VMs)!

Is there a way to bring some sense to this nihilism?

Yes, there is. By taking a practical approach: use the best case values. You know that the actual performance will always be equal to or worse than that, but we have to live with this.

To evaluate the best case let’s consider the two systems we analyzed in the “SAPS Olympics: single thread performance” post in this blog.

Dell PowerEdge R730 2s/36c/72t Intel Xeon E5-2699 v3 @2.30 GHz	Physical System	IBM POWER S824 2s/24c/192t POWER8 @3.52GHz
36	Cores	24
72	Threads	192

The best performance VM setup running on these systems is a single VM with all the processors assigned, that is:

IBM POWER S824 2s/24c/192t POWER8 @3.52GHz with 24 VPs (= 1 VP/core x 24 cores).
Dell PowerEdge R730 2s/36c/72t Intel Xeon E5-2699 v3 @2.30 GHz with 72 vCPUs (= 1 vCPU/thread x 2 thread/core x 36 cores).

And the final results would be as follows, without taking into account the reduction of capacity due to virtualization:

Dell PowerEdge R730 2s/36c/72t Intel Xeon E5-2699 v3 @2.30 GHz	Physical System	IBM POWER S824 2s/24c/192t POWER8 @3.52GHz
36	Cores	24
72	Threads	192
1	VM	1
72	vCPU / VP	24
90120	SAPS	115870
1250	SAPS/vCPU	4828

The dramatic difference, 4828 SAPS/VP vs 1250 SAPS/vCPU, would be even greater taking into account virtualization effects, as it is widely held that POWERVM is more efficient than ESX. We may consider a 3-5% reduction for POWERVM and a 10-15% reduction for ESX.
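The per-virtual-CPU figures follow directly from the table. A minimal sketch, assuming one virtual CPU per schedulable unit (one core for a POWERVM VP, one thread for an ESX vCPU) and no virtualization overhead; the function and parameter names are invented for the illustration.

```python
def best_case_saps_per_virtual_cpu(saps, cores, units_per_core):
    """Best-case SAPS of one virtual CPU in a single VM that owns the machine.

    units_per_core is the number of schedulable units per core:
    1 for POWERVM (VP-to-core), 2 for ESX on this Xeon with HT on (vCPU-to-thread).
    """
    virtual_cpus = cores * units_per_core
    return saps / virtual_cpus


power_vp = best_case_saps_per_virtual_cpu(115870, cores=24, units_per_core=1)
xeon_vcpu = best_case_saps_per_virtual_cpu(90120, cores=36, units_per_core=2)
print(f"POWER8 VP: {power_vp:.0f} SAPS, Xeon vCPU: {xeon_vcpu:.0f} SAPS")
print(f"Raw ratio: {power_vp / xeon_vcpu:.2f}")
```

The exact Xeon quotient is about 1252 SAPS per vCPU; the table rounds it to 1250.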

If Intel Xeon Hyper-Threading is switched off (HT=Off), which seldom happens in a virtualized environment, the capacity numbers would improve for Intel vCPUs to 2019 SAPS per vCPU ( = 2019 SAPS/core x 1 core/thread x 1 thread/vCPU ), again without taking into account virtualization overheads.

Summarizing

Which capacity should be assigned to vCPUs? I would take the above estimated values, representing best cases in benchmark conditions, reduced by 3-5% for POWERVM and by 10-15% for ESX. This results in this approximate and simple relationship:

1 POWER8 VP ≈ 4 Xeon Haswell-EP vCPU (HT=On)
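As a rough check of this relationship, the sketch below applies the overhead ranges quoted in this post (3-5% for POWERVM, 10-15% for ESX) to the best-case benchmark values. The overhead percentages are estimates, not measurements, and the helper name vp_to_vcpu_ratio is hypothetical.

```python
POWER_VP_SAPS = 115870 / 24   # best case, 1 VP = 1 POWER8 core
XEON_VCPU_SAPS = 90120 / 72   # best case, 1 vCPU = 1 Xeon thread (HT on)


def vp_to_vcpu_ratio(powervm_overhead, esx_overhead):
    """How many Xeon vCPUs are equivalent to one POWER8 VP."""
    power = POWER_VP_SAPS * (1 - powervm_overhead)
    xeon = XEON_VCPU_SAPS * (1 - esx_overhead)
    return power / xeon


# Least favorable and most favorable combinations for POWER.
low = vp_to_vcpu_ratio(powervm_overhead=0.05, esx_overhead=0.10)
high = vp_to_vcpu_ratio(powervm_overhead=0.03, esx_overhead=0.15)
print(f"1 POWER8 VP is roughly {low:.1f} to {high:.1f} Xeon vCPUs")
```

Both ends of the range stay slightly above 4, which is consistent with rounding the relationship to 4.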

Friday, August 8, 2014

Demystifying Virtual Capacity Mini Papers

The IBM Technology Leadership Council from Brazil has translated into Portuguese and published two brief articles I have written on virtual capacity. Here are the public links:

https://www.ibm.com/developerworks/community/blogs/tlcbr/resource/mp/TLC-BR_Mini_Paper_Ano_9_N_213_VirtualCapacityPartI.pdf

https://www.ibm.com/developerworks/community/blogs/tlcbr/resource/mp/TLC-BR_Mini_Paper_Ano_9_N_214_VirtualCapacityPartII.pdf

As some of you, like me, might not be able to fully understand Portuguese, I include below the English version of both mini papers.

Demystifying Virtual Capacity, Part I

In a virtualized system, virtual machines share the underlying physical machine (PM). This physical machine has a well-defined processing capacity. Let us call virtual capacity the actual capacity used by a virtual machine.

How much of the real physical capacity goes to a particular virtual machine (VM)? What are the parameters that determine the virtual capacity? In this paper we will briefly review these factors. Different hypervisors and virtualization technologies use different names for them, but in all cases the underlying concepts are largely the same.

· Physical Machine capacity. The virtualization layer distributes the PM capacity among its VMs. Of course, this PM capacity is the absolute limit of the virtual capacity: a hosted VM cannot be larger than the hosting PM. Virtualization overhead should also be taken into account. The capacity is typically measured in processor cores or aggregated CPU cycles (MHz).

· Guaranteed capacity. This is the capacity the VM will have for sure when/if demanded. For example, consider a guaranteed capacity of 4 cores. If the VM workload demand is 2 cores, the VM will use 2 cores. But if the demand is 5 cores, it will use 4 for sure and the remaining 1 core may be used or not depending on additional factors. It is also called entitlement or reservation. It’s measured in capacity units.

· Exclusive use attribute. Flag or binary value indicating if the guaranteed capacity is set aside for the exclusive use of the VM. If it is not, the unused guaranteed capacity is available for use by the rest of the VMs sharing the PM. Another name for this parameter is dedicated use.

· Limit / cap attribute. Flag or binary value that indicates if the guaranteed capacity can be exceeded or not. May the VM virtual capacity go beyond the guaranteed capacity in case of need? The answer is yes if unlimited / uncapped, and no if limited / capped. Some hypervisors specify a limit capacity not tied to the guaranteed capacity.

· Virtual cores. A fundamental, but sometimes tricky concept, and the link between the physical and the virtual world. The operating system inside the VM sees (virtual) cores and dispatches runnable processes to them. Then the hypervisor dispatches virtual cores to physical cores. The number of virtual cores may limit the virtual capacity: e.g. a VM with 2 virtual cores can never have a virtual capacity greater than 2 physical cores.

· Relative priority. This parameter sets relative priorities between VMs that compete for capacity. This competition may happen when the sum of demands is greater than the PM capacity. You can find this concept under several names like uncapped weight and shares.

The actual virtual capacity depends on all the above factors. Let's go from theory to practice, and play with a simple scenario: two VMs, the red and the blue, sharing an 8 core PM.

Red and blue virtual machines are both defined in the same way:

· Guaranteed capacity: 4 cores.

· Exclusive use: NO.

· Uncapped / Unlimited.

· Virtual cores: 8.

· Relative priority: 128.

What do you think would happen when red customers place a demand of 5 cores on the red VM, and at the same time blue customers place a demand of 5 cores on the blue VM?

According to the above parameterization, it's possible for the red VM to use 5 physical cores because it's uncapped/unlimited and it has at least 5 virtual cores. But to be able to go beyond its guarantee (4 cores), there must be free physical capacity available. And this is not the case, because the other VM, the blue one, is using its 4 cores of guaranteed capacity.

So the final capacity distribution in the above conditions is: the red VM is using 4 cores, and the blue VM is using 4 cores. Consequently the physical machine is 100% busy (all 8 cores used). You may conclude, from a sizing point of view, that the physical machine is undersized, not being able to meet all demands placed on it.

What would happen if the blue demand decreases from 5 to 1 core? In the forthcoming second part we will solve this and will play with more complex and subtle cases.

Demystifying Virtual Capacity, Part II

In the previous mini paper we defined the virtual capacity concept and identified the generic factors it depends on: physical machine (PM) capacity, guaranteed capacity, exclusive use attribute, limit / cap attribute, virtual cores and relative priority.

A very simple thought scenario was proposed: an 8-core PM with two virtual machines (VM), the red and the blue, parameterized with the following configuration:

VM	Guarant. Capacity	vCores	Exclusive Use	Limit / Cap	Priority
Red	4	8	No	No	128
Blue	4	8	No	No	128

If the red demand is 5 cores and the blue demand is 1 core, what would be the resulting capacity distribution?

The blue VM will get 1 core, as this demand is well below its guaranteed 4 cores. The remaining 3 cores are not used and, as the exclusive use attribute is not set, they are ceded back as free capacity.

The red usage increases to 5 cores, 4 of them coming from its guaranteed capacity and the additional 1 core coming from the available free capacity. Now the PM is 75% busy (6 of 8 cores) and there are no unsatisfied demands.

A general rule is that when all the VMs are uncapped and without exclusivity, and the sum of all demands is less than the PM capacity, all VM demands can be satisfied.

Let’s consider a case with competition, that is, the PM capacity is not enough to satisfy the sum of VM demands (all of them being uncapped and with no exclusive use). How is the scarce PM capacity distributed?

Suppose the same 8-core PM with 3 VMs, the red, the blue and the green, with the following configuration:

VM	Guarant. Capacity	vCores	Exclusive Use	Limit / Cap	Priority
Red	3	8	No	No	128
Blue	3	8	No	No	128
Green	2	8	No	No	128

The demands are: red and blue 4 cores each, and green 1 core. The sum of demands is 9 cores, more than the physical capacity (8 cores).

First group, VMs with demands less than or equal to their guarantees. They are serviced, and the rest of the guarantee is ceded and increases the free capacity. The green VM is in this group: it uses 1 core and cedes 1 core.

Second group, VMs with demands higher than their guarantees. They receive their guarantee plus the right proportion, according to their priorities, of the free capacity. The red and the blue are in this group: they both use 3 cores (the guaranteed capacity) plus 0.5 cores, coming from the 1 free core divided into two equal parts, as both VMs have the same priority and thus receive the same fraction. In this situation, the PM is 100% busy and there are unsatisfied demands.
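The two-group rule can be sketched in Python. This is an illustration built from the definitions in Part I, not the logic of any particular hypervisor or tool: the VM class and the distribute function are hypothetical names, and handing back the leftover when a VM needs less than its proportional share is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    guaranteed: float        # guaranteed capacity, in cores
    vcores: int              # virtual cores
    priority: float          # relative priority (weight)
    demand: float            # workload demand, in cores
    capped: bool = False     # True: cannot exceed the guaranteed capacity
    exclusive: bool = False  # True: unused guarantee is not ceded


def distribute(pm_cores, vms):
    """Return {vm name: cores used} for one physical machine."""
    ceiling, alloc, reserved = {}, {}, 0.0
    for vm in vms:
        # A VM can never use more than its demand or its virtual cores,
        # nor more than its guarantee when it is capped.
        top = min(vm.demand, vm.vcores)
        if vm.capped:
            top = min(top, vm.guaranteed)
        ceiling[vm.name] = top
        # Every VM is first serviced up to its guarantee.
        alloc[vm.name] = min(top, vm.guaranteed)
        # Unused guarantee is ceded unless it is for exclusive use.
        reserved += vm.guaranteed if vm.exclusive else alloc[vm.name]
    free = pm_cores - reserved

    # Second group: split the free capacity in proportion to priority.
    eps = 1e-9
    hungry = [vm for vm in vms if ceiling[vm.name] - alloc[vm.name] > eps]
    while free > eps and hungry:
        total_weight = sum(vm.priority for vm in hungry)
        if total_weight <= 0:
            break
        granted = 0.0
        for vm in hungry:
            share = free * vm.priority / total_weight
            extra = min(share, ceiling[vm.name] - alloc[vm.name])
            alloc[vm.name] += extra
            granted += extra
        free -= granted
        hungry = [vm for vm in hungry if ceiling[vm.name] - alloc[vm.name] > eps]
    return alloc


three_vms = distribute(8, [
    VM("red", guaranteed=3, vcores=8, priority=128, demand=4),
    VM("blue", guaranteed=3, vcores=8, priority=128, demand=4),
    VM("green", guaranteed=2, vcores=8, priority=128, demand=1),
])
two_vms = distribute(8, [
    VM("red", guaranteed=4, vcores=8, priority=128, demand=5),
    VM("blue", guaranteed=4, vcores=8, priority=128, demand=1),
])
print(three_vms)
print(two_vms)
```

With the three-VM configuration the sketch gives 3.5 cores to red, 3.5 to blue and 1 to green. With the earlier two-VM scenario (red demand 5, blue demand 1) it gives 5 and 1, that is, 6 of 8 cores busy.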

What would happen if the green VM is shut down or its demand drops to zero? Or if the blue VM is limited/capped? Or if the red VM number of virtual cores is set to 2? Or if the green demand goes up to 4 cores?

Calculations to solve the generic case, if you know what to do and how to do them, are simple. I’ve created a spreadsheet implementing them. Simple, functional (but not foolproof): the virtual capacity demystifier. It comes with a companion presentation illustrating its usage. You should experiment with the tool to fully understand virtual capacities.

A final word: perhaps I should have added the following subtitle: “...in a perfect world”. In the real world there are second order effects (overheads, inefficiencies, cache misses, ...) that diminish the virtual capacities we have obtained. These effects belong to the realm of advanced practitioners and performance gurus, but you should be aware of them.