Vulkan 3D Graphics Rendering Cookbook - Second Edition
In the previous recipes, we learned how to create a Vulkan instance, a device for rendering, and a swapchain. In this recipe, we will learn how to manage command buffers and submit them using command queues, which will bring us a bit closer to rendering our first image with Vulkan.
Vulkan command buffers are used to record Vulkan commands, which can then be submitted to a device queue for execution. Command buffers are allocated from pools, which allow the Vulkan implementation to amortize the cost of resource creation across multiple command buffers. Command pools are externally synchronized, which means one command pool should not be used concurrently from multiple threads. Let’s learn how to make a convenient user-friendly wrapper on top of Vulkan command buffers and pools.
We are going to explore the command buffer management code from the LightweightVK library. Take a look at the class VulkanImmediateCommands from lvk/vulkan/VulkanClasses.h. In the previous edition of our book, we used very rudimentary command buffer management code which did not provide any synchronization because every frame was “synchronized” with vkDeviceWaitIdle(). Here we are going to explore a more pragmatic solution with some facilities for synchronization.
Let’s go back to our demo application Chapter02/01_Swapchain from the recipe Initializing Vulkan swapchain, which renders an empty black window. The main loop of the application looks as follows:
while (!glfwWindowShouldClose(window)) {
glfwPollEvents();
glfwGetFramebufferSize(window, &width, &height);
if (!width || !height) continue;
lvk::ICommandBuffer& buf = ctx->acquireCommandBuffer();
ctx->submit(buf, ctx->getCurrentSwapchainTexture());
}
Here we acquire the next command buffer and then submit it without writing any commands into it so that LightweightVK can run its swapchain presentation code and render a black window. Let’s dive deep into the implementation and learn how lvk::VulkanImmediateCommands does all the heavy lifting behind the scenes.
First, we need a helper struct, SubmitHandle, to identify previously submitted command buffers. It will be essential for implementing synchronization when scheduling work that depends on the results of a previously submitted command buffer. The struct includes an internal index for the submitted buffer and an integer ID for the submission. For convenience, handles can be converted to and from 64-bit integers.
struct SubmitHandle {
uint32_t bufferIndex_ = 0;
uint32_t submitId_ = 0;
SubmitHandle() = default;
explicit SubmitHandle(uint64_t handle) :
bufferIndex_(uint32_t(handle & 0xffffffff)),
submitId_(uint32_t(handle >> 32)) {}
bool empty() const { return submitId_ == 0; }
uint64_t handle() const
{ return (uint64_t(submitId_) << 32) + bufferIndex_; }
};
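To make the 64-bit conversion concrete, here is a small sketch that packs a handle and unpacks it again. The struct is repeated from the listing above so the snippet compiles on its own; the roundTrips() helper is hypothetical and exists only for illustration, it is not part of LightweightVK.

```cpp
#include <cassert>
#include <cstdint>

// Copy of the SubmitHandle struct from the text, repeated so this sketch is standalone.
struct SubmitHandle {
  uint32_t bufferIndex_ = 0;
  uint32_t submitId_ = 0;
  SubmitHandle() = default;
  explicit SubmitHandle(uint64_t handle)
      : bufferIndex_(uint32_t(handle & 0xffffffff)),
        submitId_(uint32_t(handle >> 32)) {}
  bool empty() const { return submitId_ == 0; }
  uint64_t handle() const {
    return (uint64_t(submitId_) << 32) + bufferIndex_;
  }
};

// Hypothetical helper (an assumption for illustration): pack a handle into
// 64 bits, unpack it, and check that both fields survive the round trip.
inline bool roundTrips(uint32_t bufferIndex, uint32_t submitId) {
  SubmitHandle h;
  h.bufferIndex_ = bufferIndex;
  h.submitId_ = submitId;
  const SubmitHandle restored(h.handle());
  return restored.bufferIndex_ == bufferIndex &&
         restored.submitId_ == submitId;
}
```

The submission ID lives in the upper 32 bits and the buffer index in the lower 32 bits, so a packed value of 0 always maps to an empty handle.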
Another helper struct, CommandBufferWrapper, is needed to encapsulate all Vulkan objects associated with a single Vulkan command buffer. This struct stores the originally allocated and currently active command buffers, the most recent SubmitHandle linked to the command buffer, a Vulkan fence, and a Vulkan semaphore. The fence is used for GPU-CPU synchronization, while the semaphore ensures that command buffers are processed by the GPU in the order they were submitted. This sequential processing, enforced by LightweightVK, simplifies many aspects of rendering.
struct CommandBufferWrapper {
VkCommandBuffer cmdBuf_ = VK_NULL_HANDLE;
VkCommandBuffer cmdBufAllocated_ = VK_NULL_HANDLE;
SubmitHandle handle_ = {};
VkFence fence_ = VK_NULL_HANDLE;
VkSemaphore semaphore_ = VK_NULL_HANDLE;
bool isEncoding_ = false;
};
Now let’s take a look at the interface of lvk::VulkanImmediateCommands.
The number of preallocated command buffers is limited by kMaxCommandBuffers. If all buffers are in use, VulkanImmediateCommands waits for an existing command buffer to become available by waiting on a fence. Typically, 64 command buffers are sufficient to ensure non-blocking operation in most cases. The constructor takes a queueFamilyIdx parameter to retrieve the appropriate Vulkan queue.
class VulkanImmediateCommands final {
public:
static constexpr uint32_t kMaxCommandBuffers = 64;
VulkanImmediateCommands(VkDevice device,
uint32_t queueFamilyIdx, const char* debugName);
~VulkanImmediateCommands();
The acquire() method returns a reference to the next available command buffer. If all command buffers are in use, it waits on a fence until one becomes available. The submit() method submits a command buffer to the assigned Vulkan queue.
const CommandBufferWrapper& acquire();
SubmitHandle submit(const CommandBufferWrapper& wrapper);
The waitSemaphore() method ensures the current command buffer waits on a given semaphore before execution. A common use case is using an “acquire semaphore” from our VulkanSwapchain object, which signals a semaphore when acquiring a swapchain image, ensuring the command buffer waits for it before starting to render into the swapchain image. The signalSemaphore() method signals a corresponding Vulkan timeline semaphore when the current command buffer finishes execution. The acquireLastSubmitSemaphore() method retrieves the semaphore signaled when the last submitted command buffer completes. This semaphore can be used by the swapchain before presentation to ensure that rendering into the image is complete. We’ll take a closer look at how this works in a moment.
void waitSemaphore(VkSemaphore semaphore);
void signalSemaphore(VkSemaphore semaphore, uint64_t signalValue);
VkSemaphore acquireLastSubmitSemaphore();
SubmitHandle getLastSubmitHandle() const;
bool isReady(SubmitHandle handle) const;
void wait(SubmitHandle handle);
void waitAll();
The private section of the class holds the local state, including an array of preallocated CommandBufferWrapper objects called buffers_[].
private:
void purge();
VkDevice device_ = VK_NULL_HANDLE;
VkQueue queue_ = VK_NULL_HANDLE;
VkCommandPool commandPool_ = VK_NULL_HANDLE;
uint32_t queueFamilyIndex_ = 0;
const char* debugName_ = "";
CommandBufferWrapper buffers_[kMaxCommandBuffers];
The three VkSemaphoreSubmitInfo structures are preinitialized with generic stageMask values. For submitting Vulkan command buffers, we use the function vkQueueSubmit2() introduced in Vulkan 1.3, which requires pointers to these structures.
VkSemaphoreSubmitInfo lastSubmitSemaphore_ = {
.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO,
.stageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT};
VkSemaphoreSubmitInfo waitSemaphore_ = {
.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO,
.stageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT};
VkSemaphoreSubmitInfo signalSemaphore_ = {
.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO,
.stageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT};
uint32_t numAvailableCommandBuffers_ = kMaxCommandBuffers;
uint32_t submitCounter_ = 1;
};
The VulkanImmediateCommands class is central to the entire operation of our Vulkan backend. Let’s dive into its implementation, examining each method in detail.
Let’s begin with the class constructor and destructor. The constructor preallocates all command buffers. For simplicity, error checking and debugging code will be omitted here; please refer to the LightweightVK library source code for full error-checking details.
The VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT flag is used to specify that any command buffers allocated from this pool can be individually reset to their initial state using the Vulkan function vkResetCommandBuffer(). To indicate that command buffers allocated from this pool will have a short lifespan, we use the VK_COMMAND_POOL_CREATE_TRANSIENT_BIT flag, meaning they will be reset or freed within a relatively short timeframe.
lvk::VulkanImmediateCommands::VulkanImmediateCommands(
VkDevice device,
uint32_t queueFamilyIndex, const char* debugName) :
device_(device), queueFamilyIndex_(queueFamilyIndex),
debugName_(debugName)
{
vkGetDeviceQueue(device, queueFamilyIndex, 0, &queue_);
const VkCommandPoolCreateInfo ci = {
.sType = VK_STRUCTURE_TYPE_COMMAND_POOL_CREATE_INFO,
.flags = VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT |
VK_COMMAND_POOL_CREATE_TRANSIENT_BIT,
.queueFamilyIndex = queueFamilyIndex,
};
vkCreateCommandPool(device, &ci, nullptr, &commandPool_);
const VkCommandBufferAllocateInfo ai = {
.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO,
.commandPool = commandPool_,
.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY,
.commandBufferCount = 1,
};
for (uint32_t i = 0; i != kMaxCommandBuffers; i++) {
CommandBufferWrapper& buf = buffers_[i];
char fenceName[256] = {0};
char semaphoreName[256] = {0};
if (debugName) {
// ... assign debug names to fenceName and semaphoreName
}
buf.semaphore_ = lvk::createSemaphore(device, semaphoreName);
buf.fence_ = lvk::createFence(device, fenceName);
vkAllocateCommandBuffers(
device, &ai, &buf.cmdBufAllocated_);
buffers_[i].handle_.bufferIndex_ = i;
}
}
lvk::VulkanImmediateCommands::~VulkanImmediateCommands() {
waitAll();
for (CommandBufferWrapper& buf : buffers_) {
vkDestroyFence(device_, buf.fence_, nullptr);
vkDestroySemaphore(device_, buf.semaphore_, nullptr);
}
vkDestroyCommandPool(device_, commandPool_, nullptr);
}
Now, let’s examine the implementation of our most important function acquire(). All error checking code is omitted again to keep the explanation clear and focused.
The function starts with a busy-wait loop that calls the purge() function, which recycles processed command buffers and resets them to their initial state, until at least one buffer becomes available. In practice, this loop almost never runs.
const lvk::VulkanImmediateCommands::CommandBufferWrapper&
lvk::VulkanImmediateCommands::acquire()
{
while (!numAvailableCommandBuffers_) purge();
Once a command buffer is available, we pick the first free one and decrement numAvailableCommandBuffers_ to ensure proper busy-waiting on the next call to acquire(). The isEncoding_ member field is used to prevent the reuse of a command buffer that has already been acquired but has not yet been submitted.
VulkanImmediateCommands::CommandBufferWrapper*
current = nullptr;
for (CommandBufferWrapper& buf : buffers_) {
if (buf.cmdBuf_ == VK_NULL_HANDLE) {
current = &buf;
break;
}
}
current->handle_.submitId_ = submitCounter_;
numAvailableCommandBuffers_--;
current->cmdBuf_ = current->cmdBufAllocated_;
current->isEncoding_ = true;
const VkCommandBufferBeginInfo bi = {
.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO,
.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT,
};
VK_ASSERT(vkBeginCommandBuffer(current->cmdBuf_, &bi));
nextSubmitHandle_ = current->handle_;
return *current;
}
Now let’s take a look at the helper function purge(), which was mentioned earlier in acquire(). This function calls vkWaitForFences() with a Vulkan fence and a timeout value of 0, which causes it to return the current status of the fence without waiting. If the fence is signaled, we can reset the command buffer and increment numAvailableCommandBuffers_. We always begin checking with the oldest submitted buffer and then wrap around.
void lvk::VulkanImmediateCommands::purge() {
const uint32_t numBuffers = LVK_ARRAY_NUM_ELEMENTS(buffers_);
for (uint32_t i = 0; i != numBuffers; i++) {
const uint32_t index = i + lastSubmitHandle_.bufferIndex_+1;
CommandBufferWrapper& buf = buffers_[index % numBuffers];
if (buf.cmdBuf_ == VK_NULL_HANDLE || buf.isEncoding_)
continue;
const VkResult result =
vkWaitForFences(device_, 1, &buf.fence_, VK_TRUE, 0);
if (result == VK_SUCCESS) {
vkResetCommandBuffer(
buf.cmdBuf_, VkCommandBufferResetFlags{0});
vkResetFences(device_, 1, &buf.fence_);
buf.cmdBuf_ = VK_NULL_HANDLE;
numAvailableCommandBuffers_++;
} else {
if (result != VK_TIMEOUT) VK_ASSERT(result);
}
}
}
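The interplay between acquire() and purge() can be tried out without a GPU. The following is a minimal CPU-only sketch, not the LightweightVK implementation: MockSlot, MockCommandPool, and gpuFinish() are hypothetical names, fences are replaced with plain booleans, and acquire() returns -1 instead of busy-waiting so that the behavior is easy to observe.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// CPU-only stand-in for one CommandBufferWrapper (assumption: no Vulkan objects).
struct MockSlot {
  bool inUse = false;         // mirrors cmdBuf_ != VK_NULL_HANDLE
  bool isEncoding = false;    // acquired but not yet submitted
  bool fenceSignaled = false; // set by the simulated GPU
};

class MockCommandPool {
 public:
  static constexpr uint32_t kNumSlots = 4;

  // Returns the index of a free slot, or -1 if nothing is free even after
  // purge(). The real acquire() keeps calling purge() instead of giving up.
  int acquire() {
    if (!numAvailable_) purge();
    if (!numAvailable_) return -1;
    for (uint32_t i = 0; i != kNumSlots; i++) {
      if (!slots_[i].inUse) {
        slots_[i].inUse = true;
        slots_[i].isEncoding = true;
        numAvailable_--;
        return int(i);
      }
    }
    return -1;
  }

  void submit(int index) {
    slots_[index].isEncoding = false;
    lastSubmitted_ = uint32_t(index);
  }

  // Simulates the GPU finishing the work in a slot (its fence gets signaled).
  void gpuFinish(int index) { slots_[index].fenceSignaled = true; }

  // Recycles finished slots, starting right after the last submitted one,
  // which is the oldest submission, and wrapping around.
  void purge() {
    for (uint32_t i = 0; i != kNumSlots; i++) {
      MockSlot& s = slots_[(i + lastSubmitted_ + 1) % kNumSlots];
      if (!s.inUse || s.isEncoding) continue;
      if (s.fenceSignaled) {
        s = MockSlot{};
        numAvailable_++;
      }
    }
  }

  uint32_t numAvailable() const { return numAvailable_; }

 private:
  std::array<MockSlot, kNumSlots> slots_{};
  uint32_t numAvailable_ = kNumSlots;
  uint32_t lastSubmitted_ = 0;
};
```

With four slots, four acquire/submit pairs exhaust the pool. A fifth acquire() only succeeds after gpuFinish() has been called for at least one slot, which corresponds to the situation where the real code would block in its while loop.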
Another crucial function is submit(), which submits a command buffer to a queue. Let’s take a look.
First, we call vkEndCommandBuffer() to finish recording a command buffer.
SubmitHandle lvk::VulkanImmediateCommands::submit(
const CommandBufferWrapper& wrapper) {
vkEndCommandBuffer(wrapper.cmdBuf_);
Then, we prepare the semaphores to wait on. The first one is the optional semaphore injected by the waitSemaphore() function. It can be an “acquire semaphore” from a swapchain or any other user-provided semaphore if we want to organize a frame graph of some sort. The second semaphore lastSubmitSemaphore_ is the semaphore signaled by a previously submitted command buffer. This ensures all command buffers are processed sequentially one by one.
VkSemaphoreSubmitInfo waitSemaphores[] = {{}, {}};
uint32_t numWaitSemaphores = 0;
if (waitSemaphore_.semaphore)
waitSemaphores[numWaitSemaphores++] = waitSemaphore_;
if (lastSubmitSemaphore_.semaphore)
waitSemaphores[numWaitSemaphores++] = lastSubmitSemaphore_;
The signalSemaphores[] are signaled when the command buffer finishes execution. There are two of them: The first is the one we allocated along with our command buffer and is used for chaining command buffers together. The second is an optional timeline semaphore, injected by the signalSemaphore() function. It is injected at the end of the frame, before presenting the final image to the screen, and is used to orchestrate the swapchain presentation.
VkSemaphoreSubmitInfo signalSemaphores[] = {
VkSemaphoreSubmitInfo{
.sType = VK_STRUCTURE_TYPE_SEMAPHORE_SUBMIT_INFO,
.semaphore = wrapper.semaphore_,
.stageMask = VK_PIPELINE_STAGE_ALL_COMMANDS_BIT},
{},
};
uint32_t numSignalSemaphores = 1;
if (signalSemaphore_.semaphore) {
signalSemaphores[numSignalSemaphores++] = signalSemaphore_;
}
With all the semaphores in place, calling vkQueueSubmit2() is straightforward. We populate the VkCommandBufferSubmitInfo structure using VkCommandBuffer from the current CommandBufferWrapper object and add all the semaphores to VkSubmitInfo2, allowing us to synchronize on them during the next submit() call.
const VkCommandBufferSubmitInfo bufferSI = {
.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_SUBMIT_INFO,
.commandBuffer = wrapper.cmdBuf_,
};
const VkSubmitInfo2 si = {
.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO_2,
.waitSemaphoreInfoCount = numWaitSemaphores,
.pWaitSemaphoreInfos = waitSemaphores,
.commandBufferInfoCount = 1u,
.pCommandBufferInfos = &bufferSI,
.signalSemaphoreInfoCount = numSignalSemaphores,
.pSignalSemaphoreInfos = signalSemaphores,
};
vkQueueSubmit2(queue_, 1u, &si, wrapper.fence_);
lastSubmitSemaphore_.semaphore = wrapper.semaphore_;
lastSubmitHandle_ = wrapper.handle_;
Once the waitSemaphore_ and signalSemaphore_ objects have been used, they should be discarded. They are meant to be used with exactly one command buffer. The submitCounter_ variable is used to set the submitId_ value in the next SubmitHandle. Here’s a trick we use: a SubmitHandle is considered empty when its submitId_ is zero, as checked by SubmitHandle::empty(). A simple way to make sure a valid handle never looks empty is to always skip the zero value of submitCounter_, hence the double increment when the counter wraps around to zero.
waitSemaphore_.semaphore = VK_NULL_HANDLE;
signalSemaphore_.semaphore = VK_NULL_HANDLE;
const_cast<CommandBufferWrapper&>(wrapper).isEncoding_ = false;
submitCounter_++;
if (!submitCounter_) submitCounter_++;
return lastSubmitHandle_;
}
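The zero-skipping counter is easy to check in isolation. Below is a tiny sketch of the same two increment lines wrapped into a hypothetical free function, nextSubmitId(), which is an illustrative name and not part of LightweightVK.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring the two increment lines at the end of submit().
inline uint32_t nextSubmitId(uint32_t counter) {
  counter++;
  if (!counter) counter++; // wrapped around to 0: skip it, 0 marks an empty handle
  return counter;
}
```

Unsigned overflow is well defined in C++, so after 0xffffffff the counter wraps to 0 and is immediately bumped to 1.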
This code is already sufficient to manage command buffers in an application. However, let’s take a look at other methods of VulkanImmediateCommands that simplify working with Vulkan fences by hiding them behind SubmitHandle. The next most useful method is isReady(), which serves as our high-level equivalent of vkWaitForFences() with the timeout set to 0.
bool VulkanImmediateCommands::isReady(
const SubmitHandle handle) const
{
if (handle.empty()) return true;
Next, we check whether the command buffer has already been recycled by the purge() method we explored earlier.
const CommandBufferWrapper& buf =
buffers_[handle.bufferIndex_];
if (buf.cmdBuf_ == VK_NULL_HANDLE) return true;
If the command buffer was recycled and then reused for a newer submission, the submitId values would be different. Only after this comparison can we invoke the Vulkan API to check the status of our VkFence object.
if (buf.handle_.submitId_ != handle.submitId_) return true;
return vkWaitForFences(device_, 1, &buf.fence_, VK_TRUE, 0) ==
VK_SUCCESS;
}
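The order of these checks is what makes stale handles safe to query. Here is a CPU-only sketch of the same decision chain, assuming simplified stand-in types: Handle, Slot, and the fenceSignaled flag are hypothetical and replace SubmitHandle, CommandBufferWrapper, and the vkWaitForFences() call.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins (assumptions for illustration, not LightweightVK types).
struct Handle {
  uint32_t bufferIndex = 0;
  uint32_t submitId = 0;
};

struct Slot {
  bool inFlight = false;      // mirrors cmdBuf_ != VK_NULL_HANDLE
  uint32_t submitId = 0;      // ID of the submission currently occupying the slot
  bool fenceSignaled = false; // stands in for a zero-timeout fence query
};

inline bool isReady(const Slot* slots, Handle h) {
  if (h.submitId == 0) return true;          // empty handle
  const Slot& s = slots[h.bufferIndex];
  if (!s.inFlight) return true;              // already recycled
  if (s.submitId != h.submitId) return true; // slot reused by a newer submission
  return s.fenceSignaled;                    // same submission: ask the fence
}
```

A handle whose slot has been reused reports ready even though the slot's fence is not signaled, because that fence now belongs to a different, newer submission.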
The isReady() method provides a simple interface to Vulkan fences, which can be exposed to applications using the LightweightVK library without revealing the actual VkFence objects or the entire mechanism of how VkCommandBuffer objects are submitted and reset.
There is a pair of similar methods that allow us to wait for a specific VkFence hidden behind SubmitHandle.
The first method is wait(), and it waits for a single fence to be signaled. Two important points to mention here: We can detect a wait operation on a non-submitted command buffer using the isEncoding_ flag. Also, we call purge() at the end of the function because we are sure there is now at least one command buffer available to be reclaimed. There’s a special shortcut here: if we call wait() with an empty SubmitHandle, it will invoke vkDeviceWaitIdle(), which is often useful for debugging.
void lvk::VulkanImmediateCommands::wait(
const SubmitHandle handle) {
if (handle.empty()) {
vkDeviceWaitIdle(device_);
return;
}
if (isReady(handle)) return;
if (!LVK_VERIFY(!buffers_[handle.bufferIndex_].isEncoding_))
return;
VK_ASSERT(vkWaitForFences(device_, 1,
&buffers_[handle.bufferIndex_].fence_, VK_TRUE, UINT64_MAX));
purge();
}
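The second method is waitAll(). It waits for all submitted command buffers to complete and then calls purge() again to reclaim all completed command buffers.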
void lvk::VulkanImmediateCommands::waitAll() {
VkFence fences[kMaxCommandBuffers];
uint32_t numFences = 0;
for (const CommandBufferWrapper& buf : buffers_) {
if (buf.cmdBuf_ != VK_NULL_HANDLE && !buf.isEncoding_)
fences[numFences++] = buf.fence_;
}
if (numFences) VK_ASSERT(vkWaitForFences(
device_, numFences, fences, VK_TRUE, UINT64_MAX));
purge();
}
Those are all the details about the low-level command buffers implementation. Now, let’s take a look at how this code works together with our demo application.
Let’s go all the way back to our demo application Chapter02/01_Swapchain and its main loop. We call the function VulkanContext::acquireCommandBuffer(), which returns a reference to a high-level interface lvk::ICommandBuffer. Then, we call VulkanContext::submit() to submit that command buffer.
while (!glfwWindowShouldClose(window)) {
glfwPollEvents();
glfwGetFramebufferSize(window, &width, &height);
if (!width || !height) continue;
lvk::ICommandBuffer& buf = ctx->acquireCommandBuffer();
ctx->submit(buf, ctx->getCurrentSwapchainTexture());
}
Here’s what is going on inside those functions.
The function VulkanContext::acquireCommandBuffer() is very simple. It stores a new lvk::CommandBuffer object inside VulkanContext and returns a reference to it. This lightweight object implements the lvk::ICommandBuffer interface and, in its constructor, just calls VulkanImmediateCommands::acquire(), which we explored earlier.
ICommandBuffer& VulkanContext::acquireCommandBuffer() {
LVK_ASSERT_MSG(!pimpl_->currentCommandBuffer_.ctx_,
"Cannot acquire more than 1 command buffer simultaneously");
pimpl_->currentCommandBuffer_ = CommandBuffer(this);
return pimpl_->currentCommandBuffer_;
}
VulkanContext::submit() is more elaborate. Besides submitting a command buffer, it takes an optional argument of a swapchain texture to be presented. For now, we will skip this part and focus only on the command buffer submission.
SubmitHandle VulkanContext::submit(
const lvk::ICommandBuffer& commandBuffer, TextureHandle present) {
vulkan::CommandBuffer* vkCmdBuffer =
const_cast<vulkan::CommandBuffer*>(
static_cast<const vulkan::CommandBuffer*>(&commandBuffer));
if (present) {
// … do proper layout transitioning for the Vulkan image
}
To synchronize with the swapchain, we have a uint64_t frame counter VulkanSwapchain::currentFrameIndex_, which increments monotonically with each presented frame. We have a specific number of frames in the swapchain, let’s say 3 for example. Then, we can calculate different timeline signal values for each swapchain image so that we wait on these values every 3 frames. We wait for these corresponding timeline values when we want to acquire the same swapchain image the next time, before calling vkAcquireNextImageKHR(). For example, we render frame 0, and the next time we want to acquire it, we wait until the signal semaphore value reaches at least 3. Here, we call the function signalSemaphore() mentioned earlier to inject this timeline signal into our command buffer submission.
const bool shouldPresent = hasSwapchain() && present;
if (shouldPresent) {
const uint64_t signalValue = swapchain_->currentFrameIndex_ +
swapchain_->getNumSwapchainImages();
swapchain_->timelineWaitValues_[
swapchain_->currentImageIndex_] = signalValue;
immediate_->signalSemaphore(timelineSemaphore_, signalValue);
}
Then, we call VulkanImmediateCommands::submit() and use its last submit semaphore to tell the swapchain to wait until the rendering is completed.
vkCmdBuffer->lastSubmitHandle_ =
immediate_->submit(*vkCmdBuffer->wrapper_);
if (shouldPresent) {
swapchain_->present(immediate_->acquireLastSubmitSemaphore());
}
Finally, we process the deferred tasks. A deferred task is a std::packaged_task that should only be run when the associated SubmitHandle, and hence the VkFence behind it, is ready. This mechanism is very helpful for managing or deallocating Vulkan resources that might still be in use by the GPU, and will be discussed in subsequent chapters.
processDeferredTasks();
SubmitHandle handle = vkCmdBuffer->lastSubmitHandle_;
pimpl_->currentCommandBuffer_ = {};
return handle;
}
Finally, let’s look into VulkanSwapchain::getCurrentTexture() to see how vkAcquireNextImageKHR() interacts with all the aforementioned semaphores. Here, we wait on the timeline semaphore using the specific signal value for the current swapchain image, which we calculated in the code above. If you’re confused, the pattern here is that for rendering frame N, we wait for the signal value N. After submitting GPU work, we signal the value N+numSwapchainImages.
lvk::TextureHandle lvk::VulkanSwapchain::getCurrentTexture() {
if (getNextImage_) {
const VkSemaphoreWaitInfo waitInfo = {
.sType = VK_STRUCTURE_TYPE_SEMAPHORE_WAIT_INFO,
.semaphoreCount = 1,
.pSemaphores = &ctx_.timelineSemaphore_,
.pValues = &timelineWaitValues_[currentImageIndex_],
};
vkWaitSemaphores(device_, &waitInfo, UINT64_MAX);
Then, we pass an acquire semaphore to vkAcquireNextImageKHR(). After this call, we pass this acquireSemaphore to VulkanImmediateCommands::waitSemaphore() so that we wait on it before submitting the next command buffer that renders into this swapchain image.
VkSemaphore acquireSemaphore =
acquireSemaphore_[currentImageIndex_];
vkAcquireNextImageKHR(device_, swapchain_, UINT64_MAX,
acquireSemaphore, VK_NULL_HANDLE, &currentImageIndex_);
getNextImage_ = false;
ctx_.immediate_->waitSemaphore(acquireSemaphore);
}
if (LVK_VERIFY(currentImageIndex_ < numSwapchainImages_))
return swapchainTextures_[currentImageIndex_];
return {};
}
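The wait and signal values are easier to follow with concrete numbers. The sketch below is a simplified CPU-only model of the bookkeeping, not the LightweightVK code: FrameSync and simulateTimeline() are hypothetical names, and the model assumes that swapchain images are handed out round-robin (image = frame % numImages), which a real swapchain does not guarantee.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Values used by one frame in this simplified model (hypothetical type).
struct FrameSync {
  uint64_t waitValue;   // timeline value waited on before reusing the image
  uint64_t signalValue; // timeline value signaled when the frame's GPU work ends
};

// Assumption: images are acquired round-robin; real drivers may pick any order.
inline std::vector<FrameSync> simulateTimeline(uint32_t numImages,
                                               uint64_t numFrames) {
  std::vector<uint64_t> timelineWaitValues(numImages, 0);
  std::vector<FrameSync> out;
  for (uint64_t frame = 0; frame != numFrames; frame++) {
    const uint32_t image = uint32_t(frame % numImages);
    const uint64_t waitValue = timelineWaitValues[image];
    const uint64_t signalValue = frame + numImages; // currentFrameIndex_ + N
    timelineWaitValues[image] = signalValue;        // remembered per image
    out.push_back({waitValue, signalValue});
  }
  return out;
}
```

With 3 images, frames 0 to 2 wait on the value 0 and signal 3, 4, and 5. Frame 3 reuses image 0 and therefore waits for the value 3, which matches the "wait for N, signal N+numSwapchainImages" pattern described above.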
Now we have a working subsystem to wrangle Vulkan command buffers and expose VkFence objects to user applications in a clean and straightforward way. We didn’t cover the ICommandBuffer interface in this recipe, but we will address it shortly in this chapter while working on our first Vulkan rendering demo. Before we start rendering, let’s learn how to use compiled SPIR-V shaders from the recipe Compiling Vulkan shaders at runtime in Chapter 1.
We recommend referring to Vulkan Cookbook by Packt for in-depth coverage of swapchain creation and command queues management.