vSphere High Performance Cookbook

Overview of this book

VMware vSphere is the key virtualization technology in today's market. vSphere is a complex tool, and incorrect design and deployment can create performance-related problems. vSphere High Performance Cookbook focuses on solving those problems and on providing best practices and performance-enhancing techniques. It offers a comprehensive understanding of the different components of vSphere and of how these components interact with the physical layer, which includes the CPU, memory, network, and storage. If you want to improve or troubleshoot vSphere performance, this book is for you. It will teach you how to tune and grow a VMware vSphere 5 infrastructure, with a focus on tuning, optimizing, and scaling the infrastructure using the vSphere Client graphical user interface. The book will equip you with the knowledge, skills, and abilities to build and run a high-performing VMware vSphere virtual infrastructure. You will learn how to configure and manage ESXi CPU, memory, networking, and storage for sophisticated, enterprise-scale environments. You will also learn how to manage changes to the vSphere environment and optimize the performance of all vSphere components. The book also covers high-value and often overlooked performance-related topics such as the NUMA-aware CPU scheduler, the VMM scheduler, core sharing, virtual memory reclamation techniques, checksum offloading, VM DirectPath I/O, queuing on the storage array, command queuing, vCenter Server design, and virtual machine and application tuning. By the end of this book you will be able to identify, diagnose, and troubleshoot operational faults and critical performance issues in vSphere.

Spotting CPU overcommitment

CPU overcommitment occurs when the total number of vCPUs allocated to the running virtual machines is greater than the number of physical cores on the host.

CPU overcommitment is normal practice in many situations because it increases the consolidation ratio; however, you need to monitor it closely.

CPU overcommitment is not recommended when you need to guarantee the performance of a tier-1 application with a tight SLA. It can, however, be used successfully to consolidate light workloads densely and reduce power consumption on modern, multi-core systems.
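The definition above reduces to simple arithmetic. The following is a minimal sketch, assuming you already know the vCPU count of each running VM and the number of physical cores on the host; the function names and the example host are hypothetical and not part of any VMware API.

```python
def overcommit_ratio(vcpus_per_vm, physical_cores):
    """Return total allocated vCPUs divided by physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


def is_overcommitted(vcpus_per_vm, physical_cores):
    """True when more vCPUs are allocated than physical cores exist."""
    return sum(vcpus_per_vm) > physical_cores


# Hypothetical host: 8 physical cores and four running VMs.
running_vms = [4, 4, 2, 2]
print(f"{overcommit_ratio(running_vms, 8):.2f}:1")  # prints 1.50:1
print(is_overcommitted(running_vms, 8))             # prints True
```

A ratio above 1:1 only says that the host is overcommitted by allocation; whether that hurts performance depends on the ready and used times discussed in the rest of this recipe.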

Getting ready

To step through this recipe, you need a running ESXi Server, a couple of running CPU-hungry virtual machines, an SSH client (such as PuTTY), vCenter Server, and a working installation of vSphere Client. No other prerequisites are required.

The following table describes the esxtop CPU performance metrics used in this recipe:

Esxtop metric	Description	Implication
%RDY	Percentage of time a vCPU spends in a run queue, waiting for the CPU scheduler to let it run on a physical CPU.	A high %RDY value (use 20 percent as a starting point) may indicate that the virtual machine is under resource contention. Monitor this; if application performance is acceptable, a higher threshold may be tolerated.
%USED	Percentage of possible CPU processing cycles that were actually used for work during this time interval.	The %USED value alone does not necessarily indicate that the CPUs are overcommitted. However, high %RDY values combined with high %USED values are a strong indicator that your CPU resources are overcommitted.
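The rule in the table can be written as a small decision function. This is only a sketch: the 20 percent %RDY threshold is the starting point suggested in the table, while the 80 percent %USED threshold and the returned labels are assumptions chosen for illustration.

```python
RDY_THRESHOLD = 20.0   # percent; starting point suggested in the table
USED_THRESHOLD = 80.0  # percent; assumed value, tune for your environment


def assess_vm(pct_used, pct_rdy,
              used_threshold=USED_THRESHOLD, rdy_threshold=RDY_THRESHOLD):
    """Classify one VM sample from its %USED and %RDY values."""
    high_used = pct_used >= used_threshold
    high_rdy = pct_rdy >= rdy_threshold
    if high_used and high_rdy:
        return "likely overcommitted"
    if high_rdy:
        return "contention, investigate"
    if high_used:
        return "busy, not contended"
    return "ok"


print(assess_vm(pct_used=95.0, pct_rdy=30.0))  # prints likely overcommitted
print(assess_vm(pct_used=95.0, pct_rdy=2.0))   # prints busy, not contended
```

Keeping the thresholds as parameters reflects the advice in the table: if the application still performs acceptably, a higher %RDY threshold may be tolerated.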

How to do it...

To spot CPU overcommitment, monitor the %RDY and %USED metrics described in the preceding table. To view them in esxtop, perform the following steps:

Log in to the ESXi Server through the SSH client.
Type esxtop and press Enter.
Monitor the preceding values to understand CPU overcommitment.
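If you prefer to capture the values instead of watching them interactively, esxtop can also run in batch mode (for example, esxtop -b -d 5 -n 12 > capture.csv) and write CSV output. The sketch below averages %USED and %RDY per VM from such a capture. It assumes the column names follow the pattern \\host\Group Cpu(ID:name)\% Used and \\host\Group Cpu(ID:name)\% Ready; verify this against your own capture. The embedded SAMPLE data is invented for illustration.

```python
import csv
import io
import re
from collections import defaultdict

# Assumed column pattern, e.g.: \\esx01\Group Cpu(1001:Res-Hungry-1)\% Ready
COLUMN = re.compile(r"\\Group Cpu\(\d+:(?P<vm>[^)]+)\)\\% (?P<metric>Used|Ready)$")

# Invented two-sample capture for one VM; real captures have many more columns.
SAMPLE = r'''"(PDH-CSV 4.0) (UTC)(0)","\\esx01\Group Cpu(1001:Res-Hungry-1)\% Used","\\esx01\Group Cpu(1001:Res-Hungry-1)\% Ready"
"01/01/2024 10:00:00","95.0","30.0"
"01/01/2024 10:00:05","97.0","34.0"
'''


def average_used_ready(csv_text):
    """Return {vm: {"Used": avg, "Ready": avg}} from batch-mode CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Map column index -> (vm name, metric) for the columns we care about.
    wanted = {}
    for index, name in enumerate(header):
        match = COLUMN.search(name)
        if match:
            wanted[index] = (match["vm"], match["metric"])
    totals = defaultdict(lambda: defaultdict(float))
    rows = 0
    for row in reader:
        if not row:
            continue  # skip blank lines
        rows += 1
        for index, (vm, metric) in wanted.items():
            totals[vm][metric] += float(row[index])
    return {vm: {metric: total / rows for metric, total in metrics.items()}
            for vm, metrics in totals.items()}


for vm, averages in average_used_ready(SAMPLE).items():
    print(vm, averages)
```

To use it on a real capture, read the file contents and pass the text to average_used_ready; VMs whose averages are high in both columns are the ones to examine first.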

This example uses esxtop to detect CPU overcommitment. Looking at the pCPU line near the top of the screen, you can determine that this host's two CPUs are 100 percent utilized. Four active virtual machines are shown, Res-Hungry-1 to Res-Hungry-4. These virtual machines are active because they have relatively high values in the %USED column. The values in the %USED column alone do not necessarily indicate that the CPUs are overcommitted. In the %RDY column, you see that the three active virtual machines have relatively high values. High %RDY values, plus high %USED values, are a sure indicator that your CPU resources are overcommitted.

From the CPU view, navigate to a VM and press the E key to expand the view. This gives a detailed per-vCPU view for the VM. This matters because CPU ready is best interpreted across the whole host rather than for a single VM. A high ready percentage suggests that contention could be an issue, particularly if other VMs show high utilization when more vCPUs than physical cores are present. In that case, the busy VMs could be causing high ready time even on a VM that is mostly idle. In short, if CPU ready time is high on the VMs on a host, check whether other VMs on that host are also seeing performance issues.

You can also use the vCenter performance charts to spot CPU overcommitment, as follows:

Log in to the vCenter Server using vSphere Client.
On the home screen, navigate to Hosts and Clusters.
Go to the ESXi host.
Click on the Performance tab.
Select CPU from the Switch to drop-down menu on the right-hand side.
Navigate to the Advanced tab and click on Chart Options.
Select the ESXi host in the Objects section.
Select only Used and Ready in the Counters section and click on OK.

Now you will see the ready time and the used time in the graph and you can spot the overcommitment. The following screenshot is an example output:

The following example shows that the host has high used time.
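The Ready counter in the vCenter chart is reported as a summation in milliseconds rather than as a percentage, so it is not directly comparable with the %RDY value in esxtop. A minimal sketch of the usual conversion follows, assuming the real-time chart with its 20-second sample interval; the function name is hypothetical, and other chart intervals need a different interval_s value.

```python
REALTIME_INTERVAL_S = 20  # assumed: sample interval of the real-time chart


def ready_percent(ready_ms, interval_s=REALTIME_INTERVAL_S, vcpus=1):
    """Convert a CPU ready summation (ms) to a percentage of the interval.

    Divide by the vCPU count to get an average per-vCPU figure when the
    summation covers a multi-vCPU VM.
    """
    if interval_s <= 0 or vcpus <= 0:
        raise ValueError("interval_s and vcpus must be positive")
    return ready_ms * 100 / (interval_s * 1000 * vcpus)


# 4,000 ms of ready time within one 20-second real-time sample.
print(ready_percent(4000))           # prints 20.0
# The same summation spread across a 4-vCPU VM.
print(ready_percent(4000, vcpus=4))  # prints 5.0
```

With this conversion, the 20 percent starting point used for %RDY corresponds to roughly 4,000 ms per vCPU in the real-time chart.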

How it works...

Although high ready time typically signifies CPU contention, the condition does not always warrant corrective action. If high ready time is accompanied by high used time, it might signify that the host is overcommitted.

So high used time together with high ready time on a host might signal contention. However, the host might not be overcommitted all the time, because workload demand varies: there might be periods of activity and periods that are idle.

Another very common source of high ready time for VMs, even when pCPU utilization is low, is slow storage. A vCPU that occupies a pCPU can issue a storage I/O and then sit in the WAIT state on the pCPU, blocking other vCPUs. The other vCPUs accumulate ready time, while this vCPU and this pCPU accumulate wait time (which is not part of the used or utilized time).
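The distinctions drawn above can be sketched as a pass over a series of samples: intervals where both used and ready time are high point to CPU contention, intervals with high ready time but low used time point elsewhere (such as slow storage), and the rest are uncontended. The thresholds, labels, and sample values below are assumptions for illustration only.

```python
from collections import Counter


def label_interval(used, rdy, used_threshold=80.0, rdy_threshold=20.0):
    """Label one (used %, ready %) sample; thresholds are assumed values."""
    if rdy >= rdy_threshold and used >= used_threshold:
        return "cpu-contention"
    if rdy >= rdy_threshold:
        return "ready-without-load"  # e.g. check storage latency
    return "no-contention"


def summarize(samples):
    """Count how many sampled intervals fall into each label."""
    return Counter(label_interval(used, rdy) for used, rdy in samples)


# Invented samples: (used %, ready %) per interval.
samples = [(95.0, 30.0), (20.0, 2.0), (90.0, 25.0), (30.0, 28.0)]
summary = summarize(samples)
for label, count in sorted(summary.items()):
    print(f"{label}: {count} of {len(samples)} intervals")
```

A host that shows cpu-contention in only a small share of intervals is contended during bursts rather than overcommitted all the time, which is the case described above.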
