Monday, 11 January 2016

Don’t put Xeon and POWER virtual CPUs in the same bag

Just a reminder: this blog is a mirror of the "main" site https://www-304.ibm.com/connections/blogs/performance (http://ibm.biz/demystperf)



In the pervasive virtual world the standard unit of performance capacity happens to be the virtual CPU (vCPU). “This virtual machine (VM) has 6 vCPUs”, “you will have to provide 8 vCPUs for that VM”, and the like are common sentences. It could be a reasonable metric of performance if the underlying physical CPUs and the hypervisor layer were all the same. But this is seldom the case: Intel Xeon processors combined with VMware ESX virtualization, and IBM POWER processors combined with POWERVM virtualization are very different beasts.


If you need to size or convert capacity between these dissimilar systems, want a solid basis for comparison, or would like to unmask the tricks and pitfalls that plague sizing in the virtual world, continue reading.

IBM POWERVM



The POWERVM term for vCPU is Virtual Processor (VP). The VM has, or sees, VPs, and these VPs are scheduled, in a time-shared manner, on POWER cores. Yes, read it again: one VP is scheduled on one core. I stress this because in the ESX / Intel world it is different, as you will see later.


Given this VP-to-core mapping, the VP capacity ranges between two values:
  • In the best case the capacity of one VP is the capacity that one core can deliver, that is 1 VP is 1 core           
  • In the worst case 1 VP is one tenth (1/10) of a core.
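
As a rough illustration of how wide this range is, here is a minimal sketch that computes the capacity bounds, in cores, for a VM with a given number of VPs. The function name and constants are hypothetical; the only inputs are the best-case and worst-case values stated above.

```python
# Bounds taken from the text: 1 VP is at best 1 core, at worst 1/10 of a core.
MIN_CORE_FRACTION_PER_VP = 0.1
MAX_CORE_FRACTION_PER_VP = 1.0


def vp_capacity_range(virtual_processors: int) -> tuple[float, float]:
    """Return (worst case, best case) capacity in cores for a number of VPs."""
    if virtual_processors < 1:
        raise ValueError("a VM needs at least one VP")
    return (
        virtual_processors * MIN_CORE_FRACTION_PER_VP,
        virtual_processors * MAX_CORE_FRACTION_PER_VP,
    )


low, high = vp_capacity_range(6)
print(f"6 VPs deliver between {low:.1f} and {high:.1f} cores")
```

A "6 VP" machine can therefore mean anything from a fraction of one core to six full cores, which is the point of the discussion that follows.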


The actual VP capacity depends on the following factors (revisit  “Why is the Virtual Capacity so Important?” and “The Playground of the Virtual Capacity”  in this blog for a detailed explanation):
  • configuration parameters of the VM the VP belongs to (entitlement, capped/uncapped attribute, uncapped weight)
  • configuration parameters of all the other VMs sharing the same physical machine (PM)
  • actual usage of capacity from all the other VMs sharing the same physical machine


A unit of measure must be constant, and this value is highly variable. So how have VPs been promoted to a “standard” measure of capacity? Amazing, don’t you think?


VMWARE ESX



The VM has, or sees, vCPUs, and those vCPUs are scheduled on Intel processor threads, in a time-shared manner. The mapping is vCPU-to-thread, which is different from the POWER / POWERVM case (VP-to-core).


Given this vCPU-to-thread mapping, the vCPU capacity ranges between two values:
  • In the best case the capacity of one vCPU is the capacity that one thread can deliver, that is 1 vCPU is 1 thread           
  • In the worst case one vCPU is very small (I’m not aware of a lower limit)


The actual vCPU capacity depends on the same factors described in the POWERVM case:
  • configuration parameters of the VM the vCPU belongs to
  • configuration parameters of all the other VMs sharing the same PM
  • actual usage of capacity from all the other VMs sharing the same PM.

Benchmarking vCPUs



The reputation of the vCPU as a stable unit of capacity has been destroyed. The capacity of a vCPU can range from a full core (or a full thread) down to a small fraction, and it even depends on alien factors (those of other VMs)!


Is there a way to bring some sense to this nihilism?


Yes, there is, by taking a practical approach: use the best case values. You know that the actual performance will always be equal to or worse than that, but we have to live with this.


To evaluate the best case, let’s consider the two systems we analyzed in the “SAPS Olympics: single thread performance” post in this blog.


| Physical System | Dell PowerEdge R730 2s/36c/72t Intel Xeon E5-2699 v3 @2.30 GHz | IBM POWER S824 2s/24c/192t POWER8 @3.52GHz |
|---|---|---|
| Cores | 36 | 24 |
| Threads | 72 | 192 |


The best performing VM setup on these systems is a single VM with all the processors assigned, that is:
  • IBM POWER S824 2s/24c/192t POWER8 @3.52GHz with 24 VPs  (= 1 VP/core x 24 cores).
  • Dell PowerEdge R730 2s/36c/72t Intel Xeon  E5-2699 v3 @2.30 GHz with 72 vCPUs (= 1 vCPU/thread x 2 thread/core x 36 cores).
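
The two counts above follow directly from the mapping rules (VP-to-core on POWERVM, vCPU-to-thread on ESX). A minimal sketch of that arithmetic, with a hypothetical function name and parameters chosen only for illustration:

```python
def max_virtual_cpus(cores: int, threads_per_core: int, maps_to: str) -> int:
    """Virtual CPUs of a single VM that owns the whole physical machine.

    maps_to is "core" for the POWERVM VP-to-core mapping and
    "thread" for the ESX vCPU-to-thread mapping.
    """
    if maps_to == "core":
        return cores
    if maps_to == "thread":
        return cores * threads_per_core
    raise ValueError(f"unknown mapping: {maps_to!r}")


# IBM POWER S824: 24 cores, 8 threads per core (192 threads in total)
power_vps = max_virtual_cpus(cores=24, threads_per_core=8, maps_to="core")

# Dell PowerEdge R730: 36 cores, 2 threads per core (72 threads in total)
xeon_vcpus = max_virtual_cpus(cores=36, threads_per_core=2, maps_to="thread")

print(power_vps, xeon_vcpus)
```

Note that the thread count of the POWER system plays no role in the number of VPs, while on the Xeon system it doubles the number of vCPUs.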


And the final results, without taking into account the reduction of capacity due to virtualization, would be:


| Physical System | Dell PowerEdge R730 2s/36c/72t Intel Xeon E5-2699 v3 @2.30 GHz | IBM POWER S824 2s/24c/192t POWER8 @3.52GHz |
|---|---|---|
| Cores | 36 | 24 |
| Threads | 72 | 192 |
| VM | 1 | 1 |
| vCPU / VP | 72 | 24 |
| SAPS | 90120 | 115870 |
| SAPS per vCPU / VP | 1250 | 4828 |

The dramatic difference, 4828 SAPS/VP vs 1250 SAPS/vCPU, would be even greater taking into account virtualization effects, as it is widely known that POWERVM is more efficient than ESX. We may consider a 3-5% reduction for POWERVM and a 10-15% reduction for ESX.
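
For readers who want to check the per-unit figures, here is a minimal sketch of the division behind them, using the SAPS totals and virtual CPU counts from the results table. The function name is hypothetical. The exact quotient for the Xeon system is slightly above the rounded 1250 used in the text.

```python
def saps_per_virtual_cpu(total_saps: int, virtual_cpus: int) -> float:
    """Best-case benchmark capacity of one virtual CPU (VP or vCPU)."""
    return total_saps / virtual_cpus


# SAPS totals and virtual CPU counts from the results table
power = saps_per_virtual_cpu(total_saps=115870, virtual_cpus=24)
xeon = saps_per_virtual_cpu(total_saps=90120, virtual_cpus=72)

print(f"POWER8 VP: {power:.0f} SAPS")
print(f"Xeon vCPU: {xeon:.0f} SAPS")
print(f"ratio:     {power / xeon:.2f}")
```

Before any virtualization overhead is applied, one POWER8 VP is worth a little less than four Xeon vCPUs.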


If Intel Xeon Hyperthreading is switched off (HT=Off), which seldom happens in a virtualized environment, the capacity of an Intel vCPU would improve to 2019 SAPS per vCPU ( = 2019 SAPS/core x 1 core/thread x 1 thread/vCPU ), again without taking into account virtualization overheads.

Summarizing



Which capacity should be assigned to vCPUs? I would take the values estimated above, which represent best cases in benchmark conditions, and reduce them by 3-5% for POWERVM and by 10-15% for ESX. This results in this approximate and simple relationship:

1 POWER8 VP ≈ 4 Xeon Haswell-EP vCPU (HT=On)
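
The relationship can be reproduced with a short calculation. The following is a minimal sketch, assuming the best-case SAPS values and the overhead ranges given above; the helper names, and the idea of rounding up when converting, are illustrative assumptions rather than part of the original analysis.

```python
import math

# Best-case benchmark values from the results table (SAPS per virtual CPU)
POWER8_SAPS_PER_VP = 4828
XEON_SAPS_PER_VCPU = 1250  # Hyperthreading on

# Virtualization overhead ranges suggested in the text, as fractions
POWERVM_OVERHEAD = (0.03, 0.05)
ESX_OVERHEAD = (0.10, 0.15)


def derated(saps: float, overhead: float) -> float:
    """Capacity left after subtracting the virtualization overhead."""
    return saps * (1.0 - overhead)


def vp_to_vcpu_ratio_range() -> tuple[float, float]:
    """Lowest and highest VP-to-vCPU ratio within the overhead ranges."""
    low = derated(POWER8_SAPS_PER_VP, max(POWERVM_OVERHEAD)) / derated(
        XEON_SAPS_PER_VCPU, min(ESX_OVERHEAD)
    )
    high = derated(POWER8_SAPS_PER_VP, min(POWERVM_OVERHEAD)) / derated(
        XEON_SAPS_PER_VCPU, max(ESX_OVERHEAD)
    )
    return low, high


def xeon_vcpus_for(power_vps: int, ratio: float = 4.0) -> int:
    """Hypothetical sizing helper: Xeon vCPUs (HT on) matching a number of VPs."""
    return math.ceil(power_vps * ratio)


low, high = vp_to_vcpu_ratio_range()
print(f"1 POWER8 VP is worth {low:.2f} to {high:.2f} Xeon vCPUs")
print(f"6 POWER8 VPs correspond to about {xeon_vcpus_for(6)} Xeon vCPUs")
```

The whole range stays slightly above 4, so rounding to 4 keeps the rule simple while leaning a little in favor of the Xeon side.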
