AOS 5.0: Performance improvements in Nutanix AHV

If you read my previous blog post, AOS 5.0 released, XenServer TP in full effect!, you know that we released AOS 5.0, and with it comes a new version of Nutanix AHV. This release brings a couple of performance improvements that are especially useful for the desktop virtualization use case, so this post highlights a few of them.

 

As Nutanix AHV is a KVM (Kernel-based Virtual Machine) based hypervisor, we’re able to control the stack ‘soup to nuts’, which adds tons of flexibility to improve things like resource management. With this latest release we focused on CPU scheduling improvements; some of our key focus points were:

  • cGroups
  • USB Management
  • Hugepages
  • Time Stamp Counter

cGroups

cGroups, or control groups, are used for resource management of virtual machines; they were introduced to control the resources used by host-based processes. I could extend the length of this blog post with a detailed description of cGroups and what we did, but this video explains all of this so much better.
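To make the idea of weight-based resource control a bit more concrete, here is a minimal sketch of how relative CPU weights (in the style of the Linux `cpu.shares` setting) divide CPU time when every group is busy at the same time. The group names and weights are hypothetical and chosen only for illustration; this is not the configuration AHV uses.

```python
def cpu_time_under_contention(shares, total_cpus=1.0):
    """Split CPU capacity between groups in proportion to their weights.

    This models only the fully contended case, where every group wants
    more CPU than it can get. An idle group's share is normally handed
    to the busy groups, which this sketch does not model.
    """
    total_weight = sum(shares.values())
    return {name: total_cpus * weight / total_weight
            for name, weight in shares.items()}


# Hypothetical groups and weights, for illustration only.
groups = {"vm-desktop-1": 1024, "vm-desktop-2": 1024, "host-services": 2048}
allocation = cpu_time_under_contention(groups, total_cpus=4.0)

for name, cpus in allocation.items():
    print(f"{name}: {cpus:.2f} CPUs")
```

With these numbers the two desktop VMs each end up with one CPU's worth of time and the host services group with two, because its weight is twice as high.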

USB Management

The performance and solutions engineering team, together with the AHV development team, identified a potential performance bottleneck caused by a bug in qemu and the CPU consumption of the emulated USB tablet. The problem is excessive CPU utilization by the USB host adapter, which by design polls the USB devices.

This means that the operating system constantly checks whether the media on the USB device has changed; basically, the software is stalking the hardware. According to this source, the hardware design of the UHCI, OHCI and EHCI USB host adapters forces qemu to poll when the host adapter is emulated. While this problem is already solved for Linux-based operating systems, Windows is still affected, which will impact your XenDesktop implementation.
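A rough back-of-the-envelope sketch shows why constant polling adds up on a dense desktop host. It assumes a polling rate of 1000 Hz per VM (matching the 1 ms frame interval of USB 1.x) and a made-up cost of 5 microseconds per poll; both values are assumptions for illustration, not measurements from AHV.

```python
def wakeups_per_second(vm_count, poll_hz=1000):
    """Total number of polls per second across all VMs on a host."""
    return vm_count * poll_hz


def cores_spent_polling(vm_count, cost_per_poll_us, poll_hz=1000):
    """Approximate number of CPU cores kept busy by polling alone."""
    busy_us_per_second = wakeups_per_second(vm_count, poll_hz) * cost_per_poll_us
    return busy_us_per_second / 1_000_000


# Assumed values: 100 desktops per host, 5 microseconds per poll.
polls = wakeups_per_second(100)
cores = cores_spent_polling(100, cost_per_poll_us=5)
print(f"{polls} polls per second, roughly {cores} cores spent polling")
```

Even with a small cost per poll, the work is done whether or not anything changed on the device, which is why removing the polling frees up CPU for the actual desktops.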

To prevent this excessive CPU utilization, Nutanix created a fix for the newer releases of our Acropolis Operating System (AOS).

For VMs created on AOS 4.7.3 or earlier there’s an in-guest fix to apply to bring down CPU utilization and increase performance:

Deleting these two registry subtrees when it’s confirmed USB type 1 devices are in use:

 

Hugepages

Both hugepages and Transparent Hugepages are well known in virtualization. Host memory is divided into pages, 4 KB each by default, but these pages can be much larger. In fact, 2 MB pages (huge pages) can be used when the host is deployed with support for hugepages.

For the system to know where information is stored there’s a page table, but looking up information in this page table is a costly action. To avoid spending that time on every lookup there’s the TLB (Translation Lookaside Buffer), a cache for those lookups.

Because modern hardware is often equipped with tons of RAM, the TLB, which holds only a limited number of entries, can be overwhelmed. If the TLB is not able to provide an answer, the system falls back to the original page table, which impacts performance. Larger pages shrink the page table and let each TLB entry cover more memory, which reduces TLB misses and helps performance.
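The arithmetic behind this can be sketched in a few lines. The VM size (4 GiB) and the TLB size (1536 entries) are assumptions picked for illustration; real TLB sizes differ per CPU model and often differ per page size as well.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def page_count(memory_bytes, page_size):
    """Number of pages needed to map a given amount of memory."""
    return memory_bytes // page_size


def tlb_reach(entries, page_size):
    """Amount of memory the TLB can cover without a page table walk."""
    return entries * page_size


VM_MEMORY = 4 * GIB    # assumed VM size
TLB_ENTRIES = 1536     # assumed TLB size, varies per CPU

small_pages = page_count(VM_MEMORY, 4 * KIB)
huge_pages = page_count(VM_MEMORY, 2 * MIB)
reach_small = tlb_reach(TLB_ENTRIES, 4 * KIB)
reach_huge = tlb_reach(TLB_ENTRIES, 2 * MIB)

print(f"4 KiB pages: {small_pages} pages, TLB reach {reach_small // MIB} MiB")
print(f"2 MiB pages: {huge_pages} pages, TLB reach {reach_huge // MIB} MiB")
```

Under these assumptions the same VM needs 1,048,576 small pages but only 2,048 huge pages, and the memory covered by the TLB grows from 6 MiB to 3,072 MiB.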

Time Stamp Counter

The Time Stamp Counter (TSC) and Real-Time Clock are used on AHV and by hosted Windows VMs (older kernel versions, but also Windows 2008 R2 and Windows 7). Emulating time can be very complex because different hardware can run at different frequencies or be powered on at different times. An easy example: when you migrate a VM, you want it to see the same time on the destination host as it did on the source host. All of this was fixed in our emulation code, so there’s no need to fall back to qemu for time emulation.

Time keeping for these operating systems can add to the CPU consumption of the overall host; by enabling a Windows paravirtualization feature we can use TSC when migrating guests.
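The migration example can be illustrated with a small sketch of the general scale-and-offset technique: the guest's counter is derived from the host's counter by scaling to the frequency the guest expects and adding an offset, and on migration the offset is recomputed so the guest's counter continues where it left off. The class, function names, and frequencies below are hypothetical; this is not AHV's emulation code, and it ignores the downtime during the migration itself.

```python
from dataclasses import dataclass


@dataclass
class GuestTsc:
    guest_hz: int    # frequency the guest expects its counter to tick at
    host_hz: int     # counter frequency of the host the VM runs on
    offset: int = 0  # guest ticks added after scaling

    def read(self, host_tsc):
        """Guest-visible counter for a given host counter value."""
        return host_tsc * self.guest_hz // self.host_hz + self.offset


def migrate(source, source_host_tsc, dest_host_hz, dest_host_tsc):
    """Build the destination view so the guest counter continues seamlessly."""
    guest_now = source.read(source_host_tsc)
    scaled_on_dest = dest_host_tsc * source.guest_hz // dest_host_hz
    return GuestTsc(source.guest_hz, dest_host_hz, guest_now - scaled_on_dest)


# Hypothetical hosts: a 2 GHz source and a 3 GHz destination that was
# powered on much more recently (its counter is far lower).
source = GuestTsc(guest_hz=2_000_000_000, host_hz=2_000_000_000)
before = source.read(10_000_000_000)

dest = migrate(source, 10_000_000_000,
               dest_host_hz=3_000_000_000, dest_host_tsc=600_000_000)
after = dest.read(600_000_000)

# One second later on the destination, the host counter advanced by 3e9 ticks.
one_second_later = dest.read(600_000_000 + 3_000_000_000)
print(before, after, one_second_later - after)
```

The guest reads the same counter value right before and right after the move, and it keeps ticking at the 2 GHz rate it was started with even though the destination host runs at a different frequency.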

This all resulted in some major improvements in our LoginVSI testing. Obviously this will be reflected in the upcoming refresh of our Reference Architecture for XenDesktop on Nutanix AHV (validated with LoginVSI).

Kees Baggerman

Kees Baggerman is a Staff Solutions Architect for End User Computing at Nutanix. Kees has driven numerous functional and technical designs, migrations, and implementation engagements for Microsoft, Citrix, and RES infrastructures over the years.

1 comment

  1. Suman says:

    I am trying to delete this registry key but am unable to delete it:

    HLM\SYSTEM\CurrentControlSet\Enum\USB gives me the error "Cannot delete USB: Error while deleting key." I already tried taking ownership of the key and granting full permission, but no luck.

    However, I am able to delete HLM\SYSTEM\CurrentControlSet\Control\usbflags.

    Please help. Thanks in advance.
