3D Graphics Rendering Cookbook

By : Sergey Kosarevsky, Viktor Latypov

Overview of this book

OpenGL is a popular cross-language, cross-platform application programming interface (API) used for rendering 2D and 3D graphics, while Vulkan is a low-overhead, cross-platform 3D graphics API that targets high-performance applications. 3D Graphics Rendering Cookbook helps you learn about modern graphics rendering algorithms and techniques using C++ programming along with OpenGL and Vulkan APIs. The book begins by setting up a development environment and takes you through the steps involved in building a 3D rendering engine with the help of basic, yet self-contained, recipes. Each recipe will enable you to incrementally add features to your codebase and show you how to integrate different 3D rendering techniques and algorithms into one large project. You'll also get to grips with core techniques such as physically based rendering, image-based rendering, and CPU/GPU geometry culling, to name a few. As you advance, you'll explore common techniques and solutions that will help you to work with large datasets for 2D and 3D rendering. Finally, you'll discover how to apply optimization techniques to build performant and feature-rich graphics applications. By the end of this 3D rendering book, you'll have gained an improved understanding of best practices used in modern graphics APIs and be able to create fast and versatile 3D rendering frameworks.

Introducing MeshOptimizer 

For GPUs to render a mesh efficiently, the vertex buffer should contain no duplicate vertices, so that each unique vertex is stored once and shared through the index buffer. Removing duplicates efficiently can be a complicated and computationally intensive task in any modern 3D content pipeline.

MeshOptimizer is an open-source C++ library developed by Arseny Kapoulkine that provides algorithms to help optimize meshes for modern GPU vertex and index processing pipelines. It can reindex an existing index buffer or generate an entirely new set of indices from an unindexed vertex buffer.
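
To make the idea of generating indices from an unindexed vertex buffer more concrete, here is a minimal standalone sketch. It is not MeshOptimizer's implementation; it is a simple map-based version that treats two vertices as equal only when their bytes are identical. The Vec3 type and the buildIndexBuffer() helper are hypothetical names introduced for this illustration only:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical vertex type used only for this illustration.
struct Vec3 { float x, y, z; };

// Orders vertices by their raw bytes, so two vertices are considered
// equal only when they are binary-identical.
struct BytewiseLess {
  bool operator()(const Vec3& a, const Vec3& b) const {
    return std::memcmp(&a, &b, sizeof(Vec3)) < 0;
  }
};

// Builds a deduplicated vertex buffer and a matching index buffer from an
// unindexed triangle list. Returns the number of unique vertices.
std::size_t buildIndexBuffer(const std::vector<Vec3>& unindexed,
                             std::vector<Vec3>& uniqueVertices,
                             std::vector<unsigned int>& indices) {
  assert(unindexed.size() % 3 == 0); // three vertices per triangle
  std::map<Vec3, unsigned int, BytewiseLess> seen;
  uniqueVertices.clear();
  indices.clear();
  indices.reserve(unindexed.size());
  for (const Vec3& v : unindexed) {
    auto it = seen.find(v);
    if (it == seen.end()) {
      // First time we see this vertex: append it and remember its index.
      it = seen.emplace(
        v, static_cast<unsigned int>(uniqueVertices.size())).first;
      uniqueVertices.push_back(v);
    }
    indices.push_back(it->second);
  }
  return uniqueVertices.size();
}
```

For a quad made of two triangles, the six input vertices collapse into four unique ones, and the index buffer refers to the shared corners twice. A production library would typically use a hash table rather than an ordered map, but the result is the same.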

Getting ready

We use MeshOptimizer version 0.16. Here is the Bootstrap snippet that you can use to download this version:

{
  "name": "meshoptimizer",
  "source": {
    "type": "git",
    "url": "https://github.com/zeux/meshoptimizer",
    "revision": "v0.16"
  }
}

The complete source code for this recipe can be found in Chapter2/10_MeshOptimizer.

How to do it...

Let's use MeshOptimizer to optimize the vertex and index buffer layouts of a mesh loaded by the Assimp library. Then, we can generate a simplified model of the mesh:

  1. First, we load our mesh via Assimp, as shown in the following code snippet. We preserve the existing vertices and indices exactly as they were loaded by Assimp:
    const aiScene* scene = aiImportFile(
      "data/rubber_duck/scene.gltf", aiProcess_Triangulate);
    const aiMesh* mesh = scene->mMeshes[0];
    std::vector<vec3> positions;
    std::vector<unsigned int> indices;
    for (unsigned i = 0; i != mesh->mNumVertices; i++) {
      const aiVector3D v = mesh->mVertices[i];
      positions.push_back( vec3(v.x, v.z, v.y) );
    }
    for (unsigned i = 0; i != mesh->mNumFaces; i++) {
      for ( unsigned j = 0; j != 3; j++ )
         indices.push_back(mesh->mFaces[i].mIndices[j]);
    }
    aiReleaseImport(scene);
  2. Now we should generate a remap table for our existing vertex and index data. The table holds one entry per input vertex, so both its size and the vertex count argument are positions.size():
    std::vector<unsigned int> remap( positions.size() );
    const size_t vertexCount = meshopt_generateVertexRemap(
      remap.data(), indices.data(), indices.size(),
      positions.data(), positions.size(), sizeof(vec3) );

    The MeshOptimizer documentation (https://github.com/zeux/meshoptimizer) tells us the following:

    The remap table is generated based on binary equivalence of the input vertices, so the resulting mesh will be rendered in the same way.

  3. The returned vertexCount value is the number of unique vertices that remain after remapping. Let's allocate space and generate new vertex and index buffers:
    std::vector<unsigned int> remappedIndices( indices.size() );
    std::vector<vec3> remappedVertices( vertexCount );
    meshopt_remapIndexBuffer( remappedIndices.data(),
      indices.data(), indices.size(), remap.data() );
    meshopt_remapVertexBuffer( remappedVertices.data(),
      positions.data(), positions.size(), sizeof(vec3),
      remap.data() );

    Now we can use other MeshOptimizer algorithms to optimize these buffers even further. The official documentation is pretty straightforward. We will adapt the example it provides for the purposes of our demo application.

  4. When we want to render a mesh, the GPU has to transform each vertex via a vertex shader. GPUs can reuse transformed vertices by means of a small built-in cache, usually storing between 16 and 32 vertices inside it. In order to use this small cache effectively, we need to reorder the triangles to maximize the locality of vertex references. The following call does this in place with MeshOptimizer. Note that only the index data is modified here:
    meshopt_optimizeVertexCache( remappedIndices.data(),
      remappedIndices.data(), indices.size(), vertexCount );
  5. Transformed vertices form triangles that are sent for rasterization to generate fragments. Usually, each fragment is run through a depth test first, and fragments that pass the depth test get the fragment shader executed to compute the final color. As fragment shaders get more and more expensive, it becomes increasingly important to reduce the number of fragment shader invocations. This can be achieved by reducing pixel overdraw in a mesh, and, in general, it requires the use of view-dependent algorithms. However, MeshOptimizer implements heuristics to reorder the triangles and minimize overdraw from all directions. We can use it as follows:
    meshopt_optimizeOverdraw( remappedIndices.data(),
      remappedIndices.data(), indices.size(),
      glm::value_ptr(remappedVertices[0]), vertexCount,
      sizeof(vec3), 1.05f );

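    The last parameter, 1.05, is the threshold that determines how much the algorithm can compromise the vertex cache hit ratio in exchange for less overdraw. A value of 1.05 allows the vertex cache efficiency to become up to 5% worse. We use the recommended default value from the documentation.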

  6. Once we have optimized the mesh to reduce pixel overdraw, the vertex buffer access pattern can still be optimized for memory efficiency. The GPU has to fetch specified vertex attributes from the vertex buffer and pass this data into the vertex shader. To speed up this fetch, a memory cache is used, which means optimizing the locality of vertex buffer access is very important. We can use MeshOptimizer to optimize our index and vertex buffers for vertex fetch efficiency, as follows:
    meshopt_optimizeVertexFetch( remappedVertices.data(),
      remappedIndices.data(), indices.size(),
      remappedVertices.data(), vertexCount, sizeof(vec3) );

    This function will reorder vertices in the vertex buffer and regenerate indices to match the new contents of the vertex buffer.

  7. The last thing we will do in this recipe is simplify the mesh. MeshOptimizer can generate a new index buffer that uses existing vertices from the vertex buffer with a reduced number of triangles. This new index buffer can be used to render Level-of-Detail (LOD) meshes. The following code snippet shows you how to do this using the threshold and target error values from the documentation example. The function returns the number of indices it actually produced, so we resize the LOD index buffer accordingly:
    const float threshold = 0.2f;
    const size_t target_index_count =
      size_t( remappedIndices.size() * threshold );
    const float target_error = 1e-2f;
    std::vector<unsigned int> indicesLod( remappedIndices.size() );
    indicesLod.resize( meshopt_simplify( &indicesLod[0],
      remappedIndices.data(), remappedIndices.size(),
      &remappedVertices[0].x, vertexCount, sizeof(vec3),
      target_index_count, target_error ) );

Multiple LOD meshes can be generated this way by changing the threshold value.
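
As a rough sketch of how several LOD levels could be planned, the following standalone helper computes one target index count per threshold. The lodTargetIndexCounts() function is a hypothetical name introduced here, not part of MeshOptimizer, and rounding each target down to a multiple of 3 is our own design choice so that every target describes a whole number of triangles:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns one target index count per LOD threshold. Each threshold is the
// fraction of the original indices we would like to keep, in (0, 1].
std::vector<std::size_t> lodTargetIndexCounts(
    std::size_t indexCount, const std::vector<float>& thresholds) {
  assert(indexCount % 3 == 0); // the input must be a triangle list
  std::vector<std::size_t> targets;
  targets.reserve(thresholds.size());
  for (float t : thresholds) {
    assert(t > 0.0f && t <= 1.0f);
    const std::size_t raw =
      static_cast<std::size_t>(double(indexCount) * double(t));
    // Round down to a whole number of triangles (assumption of this sketch).
    targets.push_back(raw - raw % 3);
  }
  return targets;
}
```

Each resulting value could then be passed as target_index_count to a separate meshopt_simplify() call. The simplifier may return a different number of indices than requested, which is why the recipe resizes indicesLod to the returned value.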

Let's render the optimized and LOD meshes that we created earlier:

  1. To keep this demo simple, we copy the remapped data back into the original vectors as follows:
    indices = remappedIndices;
    positions = remappedVertices;
  2. With modern OpenGL, we can store vertex and index data inside a single buffer. You can do this as follows:
    const size_t sizeIndices =
      sizeof(unsigned int) * indices.size();
    const size_t sizeIndicesLod =
      sizeof(unsigned int) * indicesLod.size();
    const size_t sizeVertices = sizeof(vec3) * positions.size();
    glNamedBufferStorage(meshData,
      sizeIndices + sizeIndicesLod + sizeVertices,
      nullptr, GL_DYNAMIC_STORAGE_BIT);
    glNamedBufferSubData(
      meshData, 0, sizeIndices, indices.data());
    glNamedBufferSubData(
      meshData, sizeIndices, sizeIndicesLod, indicesLod.data());
    glNamedBufferSubData(meshData,
      sizeIndices + sizeIndicesLod, sizeVertices, positions.data());
  3. Now we should tell OpenGL where to read the vertex and index data from. The starting offset to the vertex data is sizeIndices + sizeIndicesLod:
    glVertexArrayElementBuffer(VAO, meshData);
    glVertexArrayVertexBuffer(VAO, 0, meshData,
      sizeIndices + sizeIndicesLod, sizeof(vec3));
    glEnableVertexArrayAttrib(VAO, 0);
    glVertexArrayAttribFormat(VAO, 0, 3, GL_FLOAT, GL_FALSE, 0);
    glVertexArrayAttribBinding(VAO, 0, 0);
  4. To render the optimized mesh, we can call glDrawElements(), as follows:
    glDrawElements(GL_TRIANGLES,
      static_cast<GLsizei>(indices.size()),
      GL_UNSIGNED_INT, nullptr);
  5. To render the simplified LOD mesh, we pass the number of indices in the LOD and a byte offset to where its indices start in the index buffer. The LOD indices are stored right after the full-detail ones, so we skip sizeIndices bytes:
    glDrawElements(GL_TRIANGLES,
      static_cast<GLsizei>(indicesLod.size()),
      GL_UNSIGNED_INT, (void*)sizeIndices);
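
The byte offsets used in the preceding steps follow directly from the order in which the three blocks are packed into the buffer. The following standalone sketch, which makes no OpenGL calls, shows the arithmetic. The MeshBufferLayout structure and the computeLayout() helper are hypothetical names introduced for illustration only:

```cpp
#include <cassert>
#include <cstddef>

// Byte layout of a single buffer that stores, in this order, the
// full-detail indices, the LOD indices, and the vertex positions.
struct MeshBufferLayout {
  std::size_t offsetIndices;    // where the full-detail indices start
  std::size_t offsetIndicesLod; // where the LOD indices start
  std::size_t offsetVertices;   // where the vertex data starts
  std::size_t totalSize;        // size to allocate for the whole buffer
};

MeshBufferLayout computeLayout(std::size_t indexCount,
                               std::size_t lodIndexCount,
                               std::size_t vertexCount,
                               std::size_t vertexStride) {
  assert(vertexStride > 0);
  const std::size_t sizeIndices = sizeof(unsigned int) * indexCount;
  const std::size_t sizeIndicesLod = sizeof(unsigned int) * lodIndexCount;
  const std::size_t sizeVertices = vertexStride * vertexCount;
  MeshBufferLayout layout;
  layout.offsetIndices = 0;
  layout.offsetIndicesLod = sizeIndices;
  layout.offsetVertices = sizeIndices + sizeIndicesLod;
  layout.totalSize = sizeIndices + sizeIndicesLod + sizeVertices;
  return layout;
}
```

In terms of the recipe, offsetIndicesLod is the value cast to a pointer in the second glDrawElements() call, and offsetVertices is the offset passed to glVertexArrayVertexBuffer().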

The resulting image should look similar to the following screenshot:

Figure 2.8 – LOD mesh rendering

There's more...

This recipe uses a slightly different technique for wireframe rendering. Instead of rendering the mesh twice, we use barycentric coordinates to determine how close each fragment is to a triangle edge and change its color accordingly. Here is the geometry shader that generates barycentric coordinates for a triangular mesh:

#version 460 core
layout( triangles ) in;
layout( triangle_strip, max_vertices = 3 ) out;
layout (location=0) in vec3 color[];
layout (location=0) out vec3 colors;
layout (location=1) out vec3 barycoords;
void main()
{

Next, store the values of the barycentric coordinates for each vertex of the triangle:

  const vec3 bc[3] = vec3[](
    vec3(1.0, 0.0, 0.0),
    vec3(0.0, 1.0, 0.0),
    vec3(0.0, 0.0, 1.0)
  );
  for ( int i = 0; i < 3; i++ )
  {
     gl_Position = gl_in[i].gl_Position;
     colors = color[i];
     barycoords = bc[i];
     EmitVertex();
  }
  EndPrimitive();
}

Barycentric coordinates can be used inside the fragment shader to discriminate colors in the following way:

#version 460 core
layout (location=0) in vec3 colors;
layout (location=1) in vec3 barycoords;
layout (location=0) out vec4 out_FragColor;
float edgeFactor(float thickness)
{
  vec3 a3 = smoothstep(
    vec3(0.0), fwidth(barycoords) * thickness, barycoords );
  return min( min(a3.x, a3.y), a3.z );
}
void main()
{
  out_FragColor =
    vec4(mix(vec3(0.0), colors, edgeFactor(1.0)), 1.0);
}

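The fwidth() function returns the sum of the absolute values of the derivatives in the x and y screen coordinates, which tells us how fast the barycentric coordinates change from one pixel to the next. Scaling this value by thickness keeps the lines at approximately the same width in pixels regardless of the size of the triangle on screen. The smoothstep() function is used for antialiasing.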
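
To see how the edge factor behaves numerically, here is a small CPU-side sketch of the same arithmetic in C++. It is only an illustration: the Vec3 type and the smoothstepf() helper are hypothetical names, and the width argument is a stand-in for the value that fwidth(barycoords) would produce on the GPU:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical 3-component vector used only for this illustration.
struct Vec3 { float x, y, z; };

// Scalar equivalent of GLSL smoothstep(edge0, edge1, x).
float smoothstepf(float edge0, float edge1, float x) {
  assert(edge0 < edge1);
  const float t =
    std::min(std::max((x - edge0) / (edge1 - edge0), 0.0f), 1.0f);
  return t * t * (3.0f - 2.0f * t);
}

// bary:  barycentric coordinates of the fragment
// width: stand-in for fwidth(barycoords), supplied by the caller
// Returns 0 on a triangle edge and 1 well inside the triangle.
float edgeFactor(Vec3 bary, Vec3 width, float thickness) {
  const float ax = smoothstepf(0.0f, width.x * thickness, bary.x);
  const float ay = smoothstepf(0.0f, width.y * thickness, bary.y);
  const float az = smoothstepf(0.0f, width.z * thickness, bary.z);
  return std::min(std::min(ax, ay), az);
}
```

A fragment lying exactly on an edge has one barycentric coordinate equal to zero, so the factor is 0 and mix() selects the black line color. Away from the edges, all three coordinates exceed the transition band, the factor is 1, and the fragment keeps its original color.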