vSphere High Performance Cookbook - Second Edition

By: Kevin Elder, Christopher Kusek, Prasenjit Sarkar

Overview of this book

vSphere is a mission-critical piece of software for many businesses. It is a complex tool, and incorrect design and deployment can create performance related issues that can negatively affect the business. This book is focused on solving these problems as well as providing best practices and performance-enhancing techniques. This edition is fully updated to include all the new features in version 6.5 as well as the latest tools and techniques to keep vSphere performing at its best. This book starts with interesting recipes, such as the interaction of vSphere 6.5 components with physical layers such as CPU, memory, and networking. Then we focus on DRS, resource control design, and vSphere cluster design. Next, you’ll learn about storage performance design and how it works with VMware vSphere 6.5. Moving on, you will learn about the two types of vCenter installation and the benefits of each. Lastly, the book covers performance tools that help you get the most out of your vSphere installation. By the end of this book, you will be able to identify, diagnose, and troubleshoot operational faults and critical performance issues in vSphere 6.5.

Title Page

Credits

About the Authors

About the Reviewer

www.PacktPub.com

Customer Feedback

Preface

CPU Performance Design

Introduction

Critical performance consideration - VMM scheduler

CPU scheduler - processor topology/cache-aware

Ready time - warning sign

Spotting CPU overcommitment

Fighting guest CPU saturation in SMP VMs

Controlling CPU resources using resource settings

What is most important to monitor in CPU performance

CPU performance best practices

Memory Performance Design

Introduction

Virtual memory reclamation techniques

Monitoring a host-swapping activity

Monitoring a host-ballooning activity

Keeping memory free for VMkernel

Key memory performance counters to monitor

What counters not to use

Identifying when memory is the problem

Analyzing host and VM memory

Memory performance best practices

Networking Performance Design

Introduction

Designing a vSphere Standard Switch for load balancing and failover

Designing a vSphere Distributed Switch for load balancing and failover

What to know when offloading checksum

Selecting the correct virtual network adapter

Improving performance through VMDirectPath I/O

Improving performance through NetQueue

Improving network performance using the SplitRx mode for multicast traffic

Designing a multi-NIC vMotion

Improving network performance using network I/O control

Monitoring network capacity and performance matrix

DRS, SDRS, and Resource Control Design

Introduction

Using DRS algorithm guidelines

Using resource pool guidelines

Avoiding the use of a resource pool as a folder structure

Choosing the best SIOC latency threshold

Using storage capability and policy-driven storage

Anti-affinity rules in the SDRS cluster

Avoiding the use of the SDRS I/O metric and array-based automatic tiering together

Using VMware SIOC and array-based automatic tiering together

vSphere Cluster Design

Introduction

Trade-off factors while designing scale-up and scale-out clusters

Using VM Monitoring

vSphere Fault Tolerance design and its impact

DPM and its impact

Choosing the reserved cluster failover capacity

Choosing the correct vSphere HA cluster size

Storage Performance Design

Introduction

Designing the host for a highly available and high-performance storage

Designing a highly available and high-performance iSCSI SAN

Designing a highly available and high-performance FC storage

Performance impact of queuing on the storage array and host

Factors that affect storage performance

Using VAAI or VASA to boost storage performance

Selecting the right VM disk type

Monitoring command queuing

Identifying a severely overloaded storage

Setting up VVols

Introduction to vSAN

Health check for vSAN

Designing vCenter on Windows for Best Performance

Introduction

Things to bear in mind while designing the vCenter platform

Deploying Platform Services Controller

Deploying the vCenter server components

Designing vCenter server for redundancy

Designing a highly available vCenter database

vCenter database size and location affects performance

Using vSphere 6.x Certificate Manager for certificates

Designing vCenter server for Auto Deploy

Designing VCSA for Best Performance

Introduction

Deploying Platform Services Controller

Deploying VCSA server components

Setting up vCenter Server High Availability

Adding VCSA to your Windows domain and adding users

Checking VCSA performance using vimtop

Checking VCSA performance using the GUI

Virtual Machine and Virtual Environment Performance Design

Introduction

Setting the right time in Guest OS

Virtual NUMA considerations

Choosing the SCSI controller for storage

Impact of VM swap file placement

Using large pages in VMs

Guest OS networking considerations

When you should or should not virtualize an application

Measuring the environment's performance

Performance Tools

Introduction

PowerCLI - introduction

Iometer

Ready time - warning sign

To achieve the best performance in a consolidated environment, you must consider ready time.

Ready time is the time during which a vCPU waits in the queue for a pCPU (or physical core) to be ready to execute its instructions. The scheduler handles the queue, and when there is contention and the processing resources are stressed, the queue might become long.

Ready time describes how much of the last observation period a specific world spent waiting in the queue. Ready time for a particular world (for example, a vCPU) indicates how much time during that interval was spent waiting in the queue to get access to a pCPU. It can be expressed in percentage per vCPU over observation time, and statistically, it can't be zero on average.

The value of ready time, therefore, is an indicator of how long a VM is denied access to the pCPU resources that it wanted to use. This makes it a good indicator of performance.

When multiple processes try to use the same physical CPU, that CPU might not be immediately available and a process must wait before the ESXi host can allocate a CPU to it.

The CPU scheduler manages access to the physical CPUs on the host system. A short spike in CPU used or CPU ready indicates that you are making the best use of the host resources. However, if both the values are constantly high, the hosts are probably overloaded and performance is likely poor.

Generally, if the CPU-used value of a VM is above 90 percent and the CPU-ready value is above 20 percent per vCPU (high number of vCPUs), performance is negatively affected.
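
The rule of thumb above can be written as a small check. This is only a sketch: the function name and parameter names are hypothetical, and the default thresholds are the 90 percent and 20 percent figures quoted in the text, not values read from vSphere itself.

```python
def cpu_contention_suspected(used_pct, ready_pct_per_vcpu,
                             used_threshold=90.0, ready_threshold=20.0):
    """Return True when both values exceed the rule-of-thumb thresholds.

    used_pct           -- CPU-used value of the VM, in percent
    ready_pct_per_vcpu -- CPU-ready value per vCPU, in percent
    The thresholds default to the figures from the text (assumption:
    both conditions must hold at the same time).
    """
    return used_pct > used_threshold and ready_pct_per_vcpu > ready_threshold


if __name__ == "__main__":
    # High usage and high ready time: likely affected.
    print(cpu_contention_suspected(95.0, 25.0))
    # High usage but low ready time: the VM is busy, not waiting.
    print(cpu_contention_suspected(95.0, 5.0))
```

Treat the result as a prompt to look closer, since acceptable ready time depends on how CPU-sensitive the workload is.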

This latency may impact the performance of the guest operating system and the running of applications within a VM.

Getting ready

To step through this recipe, you need a running ESXi Server, a couple of CPU-hungry VMs, a VMware vCenter Server, and vSphere Web Client. No other prerequisites are required.

How to do it...

Let's get started:

1. Open up vSphere Web Client.
2. On the home screen, navigate to Hosts and Clusters.
3. Expand the left-hand navigation list.
4. Navigate to one of the CPU-hungry VMs.
5. Navigate to the Monitor tab.
6. Navigate to the Performance tab.
7. Navigate to the Advanced view.
8. Click on Chart Options.
9. Navigate to CPU from Chart metrics.
10. Navigate to the VM object and only select Demand, Ready, and Usage in MHz.

Note

The key metrics when investigating a potential CPU issue are as follows:

Demand: The amount of CPU that the VM is trying to use.
Usage: The amount of CPU that the VM is actually being allowed to use.
Ready: The amount of time during which the VM is ready to run (it has work it wants to do) but is unable to because vSphere could not find physical resources to run the VM on.

11. Click on OK.

In the following screenshot, you will see the high ready time of the VM:

Notice the amount of CPU this VM is demanding and compare that to the amount of CPU the VM is actually able to get (usage in MHz). The VM is demanding more than it is currently being allowed to use.

Notice that the VM is also seeing a large amount of ready time.

Note

Ready time greater than 10 percent could be a performance concern. However, some less CPU-sensitive applications and VMs can have much higher values of ready time and still perform satisfactorily.

How it works...

A vCPU is in a ready state when it is ready to run (that is, it has a task it wants to execute) but is unable to run because the vSphere scheduler cannot find physical host CPU resources to run the VM on. One potential reason for elevated ready time is that the VM is constrained by a user-set CPU limit or resource pool limit. The amount of CPU denied because of such a limit is reported as the max limited (MLMTD) metric.

Ready time is reported in different units by resxtop/esxtop and vCenter Server. In resxtop/esxtop, it is reported as an easily understood percentage. A figure of 5 percent means that the VM spent 5 percent of its last sample period waiting for available CPU resources (this is only true for 1-vCPU VMs). In vCenter Server, ready time is reported as a measurement of time. For example, in vCenter Server's real-time data, which produces a sample value every 20,000 milliseconds, a figure of 1,000 milliseconds corresponds to 5 percent ready time, and a figure of 2,000 milliseconds corresponds to 10 percent ready time.
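
The conversion between the two representations is simple arithmetic, sketched below. The function and parameter names are hypothetical. The 20-second default is the real-time sample interval mentioned above. Dividing by the number of vCPUs is an assumption that applies only when the millisecond value is an aggregate across all vCPUs of the VM.

```python
def ready_ms_to_percent(ready_ms, interval_s=20, num_vcpus=1):
    """Convert a ready-time value in milliseconds to a percentage.

    ready_ms   -- ready time reported for one sample, in milliseconds
    interval_s -- length of the sample interval in seconds
                  (20 s for real-time data, as stated in the text)
    num_vcpus  -- assumption: set this when ready_ms is summed over
                  several vCPUs, to get an average per-vCPU percentage
    """
    interval_ms = interval_s * 1000.0
    return (ready_ms * 100.0 / interval_ms) / num_vcpus


if __name__ == "__main__":
    print(ready_ms_to_percent(1000))               # the 5 percent example
    print(ready_ms_to_percent(2000))               # the 10 percent example
    print(ready_ms_to_percent(4000, num_vcpus=4))  # 4-vCPU aggregate
```

Charts with longer sample intervals need the matching `interval_s` value, otherwise the percentage is overstated.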

Although high ready time typically signifies CPU contention, the condition does not always warrant corrective action. If the value of ready time is close to the amount of time used on the CPU, and if the increased ready time occurs with occasional spikes in CPU activity but does not persist for extended periods of time, this might not indicate a performance problem. The brief performance hit is often within the accepted performance variance and does not require any action on the part of the administrator.
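
One way to separate an occasional spike from persistent contention is to look for long runs of consecutive samples above a threshold. The sketch below assumes a list of per-sample ready percentages that has already been collected. The function names, the 10 percent threshold, and the run length of 15 samples (about five minutes of 20-second samples) are illustrative assumptions, not values defined by vSphere.

```python
def longest_run_above(samples, threshold):
    """Return the length of the longest run of consecutive samples
    that are strictly above the threshold."""
    longest = 0
    current = 0
    for value in samples:
        if value > threshold:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def is_sustained(samples, threshold=10.0, min_consecutive=15):
    """Return True when ready time stays above the threshold for at
    least min_consecutive samples in a row (hypothetical defaults)."""
    return longest_run_above(samples, threshold) >= min_consecutive


if __name__ == "__main__":
    spike = [2.0, 3.0, 25.0, 30.0, 4.0, 2.0]   # brief burst
    steady = [12.0] * 20                        # persistently elevated
    print(is_sustained(spike))
    print(is_sustained(steady))
```

A brief burst like the first series would usually be left alone, while the second series would justify a closer look at the host.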
