vSphere High Performance Cookbook

Overview of this book

VMware vSphere is the key virtualization technology in today's market. vSphere is a complex tool and incorrect design and deployment can create performance-related problems. vSphere High Performance Cookbook is focused on solving those problems as well as providing best practices and performance-enhancing techniques. vSphere High Performance Cookbook offers a comprehensive understanding of the different components of vSphere and the interaction of these components with the physical layer which includes the CPU, memory, network, and storage. If you want to improve or troubleshoot vSphere performance then this book is for you! vSphere High Performance Cookbook will teach you how to tune and grow a VMware vSphere 5 infrastructure. This book focuses on tuning, optimizing, and scaling the infrastructure using the vSphere Client graphical user interface. This book will enable the reader with the knowledge, skills, and abilities to build and run a high-performing VMware vSphere virtual infrastructure. You will learn how to configure and manage ESXi CPU, memory, networking, and storage for sophisticated, enterprise-scale environments. You will also learn how to manage changes to the vSphere environment and optimize the performance of all vSphere components. This book also focuses on high value and often overlooked performance-related topics such as NUMA Aware CPU Scheduler, VMM Scheduler, Core Sharing, the Virtual Memory Reclamation technique, Checksum offloading, VM DirectPath I/O, queuing on storage array, command queuing, vCenter Server design, and virtual machine and application tuning. By the end of this book you will be able to identify, diagnose, and troubleshoot operational faults and critical performance issues in vSphere.

Spotting CPU overcommitment


When the number of vCPUs allocated to the running virtual machines is greater than the number of physical cores on a host, the condition is called CPU overcommitment.

CPU overcommitment is a normal practice in many situations because it increases the consolidation ratio; however, you need to monitor it closely.

CPU overcommitment is not recommended when you must guarantee the performance of a tier-1 application with a tight SLA. It can, however, be successfully leveraged to highly consolidate light workloads on modern, multi-core systems and reduce their power consumption.
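The definition above reduces to simple arithmetic. The following Python sketch illustrates it; the function names and the example host (8 physical cores, four running VMs) are hypothetical and chosen only for illustration.

```python
def overcommit_ratio(vcpus_per_vm, physical_cores):
    """Return total allocated vCPUs divided by the number of physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


def is_overcommitted(vcpus_per_vm, physical_cores):
    """True when running VMs hold more vCPUs than the host has physical cores."""
    return sum(vcpus_per_vm) > physical_cores


# Hypothetical host: 8 physical cores, four running VMs with 4, 4, 2, and 2 vCPUs.
running_vms = [4, 4, 2, 2]
print(overcommit_ratio(running_vms, 8))   # 1.5
print(is_overcommitted(running_vms, 8))   # True
```

A ratio above 1.0 only says that overcommitment exists; whether it hurts performance is what the rest of this recipe helps you determine.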

Getting ready

To step through this recipe, you need a running ESXi Server, a couple of running CPU-hungry virtual machines, an SSH client (such as PuTTY), vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.

The following table elaborates on Esxtop CPU Performance Metrics:

| Esxtop Metric | Description | Implication |
| --- | --- | --- |
| %RDY | Percentage of time a vCPU in a run queue is waiting for the CPU scheduler to let it run on a physical CPU. | A high %RDY time (use 20 percent as a starting point) may indicate the virtual machine is under resource contention. Monitor this; if the application speed is ok, a higher threshold may be tolerated. |
| %USED | Percentage of possible CPU processing cycles which were actually used for work during this time interval. | The %USED value alone does not necessarily indicate that the CPUs are overcommitted. However high %RDY values, plus high %USED values, are a sure indicator that your CPU resources are overcommitted. |
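The guidance in the table can be expressed as a small decision rule. The Python sketch below is one possible reading of it, not an official VMware check: the 20 percent %RDY threshold comes from the table, while the 80 percent %USED threshold and the function name `assess` are assumptions made for illustration.

```python
RDY_THRESHOLD = 20.0   # percent; the starting point suggested in the table
USED_THRESHOLD = 80.0  # percent; hypothetical value, tune it for your environment


def assess(used_pct, rdy_pct,
           rdy_threshold=RDY_THRESHOLD, used_threshold=USED_THRESHOLD):
    """Classify one VM's esxtop sample using %USED and %RDY together."""
    high_rdy = rdy_pct >= rdy_threshold
    high_used = used_pct >= used_threshold
    if high_rdy and high_used:
        return "likely overcommitted"
    if high_rdy:
        return "possible contention, investigate"
    # High %USED on its own does not indicate overcommitment.
    return "no overcommitment indicated"


print(assess(used_pct=95.0, rdy_pct=35.0))  # likely overcommitted
print(assess(used_pct=95.0, rdy_pct=2.0))   # no overcommitment indicated
print(assess(used_pct=10.0, rdy_pct=30.0))  # possible contention, investigate
```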

How to do it...

To spot CPU overcommitment, monitor the CPU metrics described in the preceding table by performing the following steps:

  1. Log in to the ESXi Server through the SSH client.

  2. Type esxtop and press Enter.

  3. Monitor the preceding values to understand CPU overcommitment.
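For longer observation windows, esxtop can also run in batch mode and write its counters to a CSV file (for example, `esxtop -b -d 5 -n 12 > cpu.csv`, where `-d` is the delay in seconds and `-n` the number of samples). The Python sketch below averages every column whose header ends with a given suffix. It assumes that the batch-mode headers end with "% Ready" and "% Used"; the host name `esx01`, the group ID `1001`, and the sample values are made up for illustration.

```python
import csv
import io


def average_by_suffix(csv_text, suffix):
    """Average every CSV column whose header ends with `suffix`."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if name.endswith(suffix)]
    totals = {i: 0.0 for i in wanted}
    rows = 0
    for row in reader:
        rows += 1
        for i in wanted:
            totals[i] += float(row[i])
    if rows == 0:
        return {}
    return {header[i]: totals[i] / rows for i in wanted}


# Hypothetical two-sample excerpt in the assumed batch-mode layout.
SAMPLE = "\n".join([
    r'"Time","\\esx01\Group Cpu(1001:Res-Hungry-1)\% Used","\\esx01\Group Cpu(1001:Res-Hungry-1)\% Ready"',
    r'"01/01/2013 10:00:00","90.0","30.0"',
    r'"01/01/2013 10:00:05","100.0","40.0"',
])

for column, value in average_by_suffix(SAMPLE, "% Ready").items():
    print(column, value)
```

To use it on a real capture, read the file contents into a string and pass them in place of `SAMPLE`; check the header names in your own output first, because they may differ between esxtop versions.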

This example uses esxtop to detect CPU overcommitment. Looking at the pCPU line near the top of the screen, you can determine that this host's two CPUs are 100 percent utilized. Four active virtual machines are shown, Res-Hungry-1 to Res-Hungry-4. These virtual machines are active because they have relatively high values in the %USED column. The values in the %USED column alone do not necessarily indicate that the CPUs are overcommitted. In the %RDY column, you see that the active virtual machines also have relatively high values. High %RDY values, plus high %USED values, are a sure indicator that your CPU resources are overcommitted.

From the CPU view, navigate to a VM and press the E key to expand the view. This gives a detailed per-vCPU view for the VM. This matters because CPU ready is best interpreted across the whole host rather than for a single VM. If a high ready percentage is noted, contention could be an issue, particularly if other VMs show high utilization when more vCPUs than physical cores are present. In that case, the busy VMs can cause high ready time even on a mostly idle VM. In short, if the CPU ready time is high on VMs on a host, check whether other VMs on that host are seeing performance issues as well.

You can also use the vCenter performance charts to spot CPU overcommitment, as follows:

  1. Log in to the vCenter Server using vSphere Client.

  2. On the home screen, navigate to Hosts and Clusters.

  3. Go to the ESXi host.

  4. Click on the Performance tab.

  5. Select CPU from the Switch To drop-down menu on the right-hand side.

  6. Navigate to the Advanced tab and click on Chart Options.

  7. Navigate to the ESXi host in the Objects section.

  8. Select only Used and Ready in the Counters section and click on OK.

Now you will see the ready time and the used time in the graph, and you can spot the overcommitment. In the book's example output, the host shows a high used time.
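Unlike esxtop, the vCenter performance chart reports Ready as a summation in milliseconds rather than as a percentage. To compare chart values with the %RDY guidance, you can convert them using the chart's sample interval, which is 20 seconds for the real-time chart. The Python sketch below shows the conversion; the function name and the sample values are hypothetical.

```python
def ready_percent(ready_ms, interval_seconds=20):
    """Convert a CPU Ready summation (milliseconds) to a percentage.

    interval_seconds is the chart's sample interval; the real-time
    chart uses 20 seconds, while historical charts use longer intervals.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return ready_ms * 100.0 / (interval_seconds * 1000.0)


# A hypothetical real-time sample of 4000 ms of ready time in a 20-second interval.
print(ready_percent(4000))      # 20.0
print(ready_percent(1000, 20))  # 5.0
```

For a chart line that aggregates several vCPUs, divide the result by the number of vCPUs to get an approximate per-vCPU figure.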

How it works...

Although high ready time typically signifies CPU contention, the condition does not always warrant corrective action. If high ready time is accompanied by high used time, it might signify that the host is overcommitted.

So high used time and high ready time on a host might signal contention. However, the host might not be overcommitted all the time, because workload demand varies: there might be periods of activity and periods that are idle.

Another very common source of high ready time for VMs, even when pCPU utilization is low, is slow storage. A vCPU that occupies a pCPU can issue a storage I/O and then sit in the WAIT state on the pCPU, blocking other vCPUs. The other vCPUs accumulate ready time, while this vCPU and this pCPU accumulate wait time (which is not a part of the used or utilized time).