vSphere High Performance Cookbook - Second Edition

By: Kevin Elder, Christopher Kusek, Prasenjit Sarkar

Overview of this book

vSphere is a mission-critical piece of software for many businesses. It is a complex tool, and incorrect design and deployment can create performance-related issues that negatively affect the business. This book focuses on solving these problems and on providing best practices and performance-enhancing techniques. This edition is fully updated to include all the new features in version 6.5 as well as the latest tools and techniques to keep vSphere performing at its best. The book starts with recipes on how vSphere 6.5 components interact with physical layers such as CPU, memory, and networking. Then we focus on DRS, resource control design, and vSphere cluster design. Next, you'll learn about storage performance design and how it works with VMware vSphere 6.5. Moving on, you will learn about the two types of vCenter installation and the benefits of each. Lastly, the book covers performance tools that help you get the most out of your vSphere installation. By the end of this book, you will be able to identify, diagnose, and troubleshoot operational faults and critical performance issues in vSphere 6.5.

Spotting CPU overcommitment

CPU overcommitment occurs when the number of vCPUs allocated to the VMs running on a host is greater than the number of physical cores on that host. It is normal practice in many situations because it increases the consolidation ratio. Nevertheless, you need to monitor it closely.
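As a rough illustration of the definition above, the following sketch compares allocated vCPUs with physical cores on one host. The function names and the sample host are hypothetical and are not part of vSphere or of the original recipe.

```python
def overcommit_ratio(vcpus_per_vm, physical_cores):
    """Return allocated vCPUs divided by physical cores on one host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


def is_overcommitted(vcpus_per_vm, physical_cores):
    """True when more vCPUs are allocated than physical cores exist."""
    return sum(vcpus_per_vm) > physical_cores


# Hypothetical host: four VMs with 2 vCPUs each on a 4-core host.
vm_vcpus = [2, 2, 2, 2]
cores = 4
print(overcommit_ratio(vm_vcpus, cores))
print(is_overcommitted(vm_vcpus, cores))
```

A ratio above 1 only says that the host is overcommitted on paper. Whether that hurts performance depends on the ready and used times discussed in the rest of this recipe.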

CPU overcommitment is not recommended when you must guarantee the performance of a tier-1 application with a tight SLA. It can, however, be used successfully to consolidate light workloads heavily and reduce their power consumption on modern multicore systems.

Getting ready

To step through this recipe, you need a running ESXi Server with SSH enabled, a couple of running CPU-hungry VMs, an SSH client (PuTTY), a vCenter Server, and vSphere Web Client. No other prerequisites are required.

The following table describes the esxtop CPU performance metrics used in this recipe:

Esxtop Metric	Description	Implication
`%RDY`	The percentage of time a vCPU in a run queue is waiting for the CPU scheduler to let it run on a physical CPU.	A high `%RDY` time (use 20 percent as the starting point) may indicate the VM is under resource contention. Monitor this; if the application speed is OK, a higher threshold may be tolerated.
`%USED`	The percentage of possible CPU processing cycles that were actually used for work during this time interval.	The `%USED` value alone does not necessarily indicate that the CPUs are overcommitted. However, high `%RDY` values plus high `%USED` values are a sure indicator that your CPU resources are overcommitted.

How to do it...

To spot CPU overcommitment, monitor the %RDY and %USED values in esxtop:

Log in to the ESXi Server using an SSH client (PuTTY).
Type esxtop and press Enter.
Monitor the %RDY and %USED values to understand CPU overcommitment.

This example uses esxtop to detect CPU overcommitment. Looking at the pCPU line near the top of the screen, you can determine that this host's two CPUs are 100 percent utilized. Four active VMs are shown, Res-Hungry-1 to Res-Hungry-4. These VMs are active because they have relatively high values in the %USED column. The values in the %USED column alone do not necessarily indicate that the CPUs are overcommitted. In the %RDY column, you see that the active VMs also have relatively high values. High %RDY values plus high %USED values are a sure indicator that your CPU resources are overcommitted.
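The rule of thumb above (high %RDY together with high %USED) can be sketched as a small filter. This is only an illustration: the readings are made-up numbers typed in by hand rather than parsed from esxtop, the 20 percent %RDY limit is the starting point from the metrics table, and the 80 percent %USED limit is an assumption because the recipe does not give one.

```python
RDY_THRESHOLD = 20.0   # percent; the starting point given in the metrics table
USED_THRESHOLD = 80.0  # percent; an assumed cut-off, not from the recipe


def contended_vms(samples, rdy_limit=RDY_THRESHOLD, used_limit=USED_THRESHOLD):
    """Return names of VMs whose %USED and %RDY are both at or above the limits."""
    return [
        name
        for name, (used, rdy) in samples.items()
        if used >= used_limit and rdy >= rdy_limit
    ]


# Hypothetical readings copied by hand from an esxtop CPU screen:
# VM name -> (%USED, %RDY)
readings = {
    "Res-Hungry-1": (95.0, 45.0),
    "Res-Hungry-2": (92.0, 38.0),
    "idle-vm": (3.0, 1.0),
}
print(contended_vms(readings))
```

Both limits are parameters so that you can raise the %RDY threshold when the application still performs acceptably, as the table suggests.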

From the CPU view, navigate to a VM and press the E key to expand the view. This gives a detailed vCPU view for the VM. This matters because CPU ready time is best interpreted across the host as a whole rather than for a single VM. A high ready percentage suggests contention, particularly when more vCPUs than physical cores are present and other VMs show high utilization. In that case, the busy VMs can cause high ready time even on a mostly idle VM. In short, if CPU ready time is high on the VMs on a host, check whether other VMs on that host are also seeing performance issues.

You can also use the vCenter performance chart to spot CPU overcommitment, as follows:

Log in to the vCenter Server using vSphere Web Client.
On the home screen, navigate to Hosts and Clusters.
Expand the left-hand navigation list.
Navigate to one of the ESXi hosts.
Navigate to the Monitor tab.
Navigate to the Performance tab.
Navigate to the Advanced view.
Click on Chart Options.
Navigate to CPU from Chart metrics.
Select only Used and Ready in the Counters section and click on OK:

Now you will see the ready time and the used time in the graph and you can spot the overcommitment. The following screenshot is an example output:

The following example shows that the host has high ready time:
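The vCenter chart and esxtop use different units, which makes their readings hard to compare directly. The sketch below assumes that the chart reports the Ready counter as a summation in milliseconds per sample and that the real-time chart samples every 20 seconds. Neither detail is stated in this recipe, so verify both against your chart before relying on the result. The function name is hypothetical.

```python
REALTIME_INTERVAL_SECONDS = 20  # assumed sampling interval of the real-time chart


def ready_percent(ready_ms, interval_seconds=REALTIME_INTERVAL_SECONDS):
    """Convert a CPU Ready summation value (ms) into a percentage of the interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return ready_ms / (interval_seconds * 1000) * 100


# Hypothetical sample: 4000 ms of ready time within one 20-second interval.
print(ready_percent(4000))
```

For charts with a longer roll-up interval, pass that interval in seconds as the second argument instead of the default.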

How it works...

Although high ready time typically signifies CPU contention, the condition does not always warrant corrective action. If the value of ready time is also accompanied by high used time, then it might signify that the host is overcommitted.

High used time together with high ready time on a host might therefore signal contention. However, the host is not necessarily overcommitted all the time, because workload demand varies: there might be periods of activity and periods that are idle.

Another very common source of high ready time for VMs, even when pCPU utilization is low, is slow storage. A vCPU that occupies a pCPU can issue a storage I/O and then sit in the WAIT state on the pCPU, blocking other vCPUs. The other vCPUs accumulate ready time, while this vCPU and pCPU accumulate wait time (which is not part of the used or utilized time).
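The reasoning in this section can be summarized as a three-way decision for each sample interval. This is a minimal sketch, not a diagnostic tool: the function name, the sample values, and the 80 percent used limit are assumptions, and the 20 percent ready limit is the starting point from the metrics table.

```python
def classify(used_pct, ready_pct, used_limit=80.0, ready_limit=20.0):
    """Label one sample interval using the reasoning in this section."""
    if ready_pct < ready_limit:
        return "ok"
    if used_pct >= used_limit:
        return "likely overcommitted"
    return "high ready, low used: check other causes such as slow storage"


# Hypothetical (used %, ready %) samples taken at successive intervals.
samples = [(95.0, 40.0), (30.0, 5.0), (25.0, 35.0)]
for used, ready in samples:
    print(used, ready, "->", classify(used, ready))
```

Classifying several intervals rather than one reflects the point that a host can be contended during busy periods and fine during idle ones.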
