HLSL Development Cookbook

By: Doron Feinstein

Overview of this book

3D graphics are becoming increasingly realistic and sophisticated as the power of modern hardware improves. The High Level Shader Language (HLSL) allows you to harness the power of shaders within DirectX 11, so that you can push the boundaries of 3D rendering like never before. HLSL Development Cookbook will provide you with a series of essential recipes to help you make the most out of different rendering techniques used within games and simulations using the DirectX 11 API. This book is specifically designed to help build your understanding via practical examples. This essential Cookbook has coverage ranging from industry-standard lighting techniques to more specialist post-processing implementations such as bloom and tone mapping. Explained in a clear yet concise manner, each recipe is also accompanied by superb examples with full documentation so that you can harness the power of HLSL for your own individual requirements.

Multiple light types in a single pass


In this chapter's introduction, we saw that the performance can be optimized by combining multiple lights into a single draw call. The GPU registers are big enough to contain four float values. We can take advantage of that size and calculate four lights at a time.

Unfortunately, one drawback of this approach is the lack of support for projected textures. One way around this issue is to render light sources that use projected textures separately from the rest of the light sources. Depending on the light setup of the rendered scene, this limitation may be acceptable.

Getting ready

Unlike the previous examples, this time you will have to send light data to the GPU in groups of four. Since it's not likely that all four light sources are going to be of the same type, the code handles all three light types in a generic way. In cases where the drawn mesh is affected by fewer than four lights, you can always disable the unused lights by setting their color to fully black.

How to do it...

In order to take full advantage of the GPU's vectorized math operations, all the light source values are going to be packed in groups of four. Four three-component variables can be packed into three four-component variables: the first holds the four X components, the second holds the four Y components, and the third holds the four Z components.

All variable packing should be done on the CPU. Keeping in mind that the constant registers of the GPU are the size of four floats, this packing is more efficient than the single light version, where most of the values use only three floats and waste the fourth one.

The X, Y, and Z components of the four light positions are packed into the following shader constants:

float4 LightPosX;
float4 LightPosY;
float4 LightPosZ;
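
As a minimal sketch of that CPU-side packing step, assuming a C++ host application, the following transposes up to four light positions into the three constants above. The Float3 and Float4 types, the PackedPositions struct, and the PackLightPositions function are hypothetical names chosen for illustration; only LightPosX, LightPosY, and LightPosZ come from the shader constants. Leaving unused lanes at zero is also an assumption of this sketch, since the recipe disables unused lights through their color.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical CPU-side types used only for this illustration.
struct Float3 { float x, y, z; };
using Float4 = std::array<float, 4>;  // lane i belongs to light i

// Mirrors the LightPosX / LightPosY / LightPosZ shader constants.
struct PackedPositions {
    Float4 LightPosX;
    Float4 LightPosY;
    Float4 LightPosZ;
};

// Transpose up to four xyz positions into three four-component values.
PackedPositions PackLightPositions(const Float3* lights, std::size_t count)
{
    assert(count <= 4 && "one group holds at most four lights");
    PackedPositions packed{};  // value-initialized: every lane starts at 0
    for (std::size_t i = 0; i < count; ++i) {
        packed.LightPosX[i] = lights[i].x;
        packed.LightPosY[i] = lights[i].y;
        packed.LightPosZ[i] = lights[i].z;
    }
    return packed;
}
```

The same transposition would apply to any other per-light value that has three components.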

Light directions are separated into X, Y, and Z components as well. This group of constants is used for both spot and capsule light source directions. For point lights, make sure to set the respective value in each constant to 0:

float4 LightDirX;
float4 LightDirY;
float4 LightDirZ;

Light color is separated into R, G, and B components. For disabled light sources, just set the respective values to 0:

float4 LightColorR;
float4 LightColorG;
float4 LightColorB;

As before, you should combine the color and intensity of each light before passing the values to the GPU.

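All four light ranges are stored in a single four-component constant. Because the lighting code multiplies by this constant instead of dividing by it, store one over each light's range:

float4 LightRangeRcp;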

All four lights' capsule lengths are stored in a single four-component constant. For non-capsule lights, just set the respective value to 0:

float4 CapsuleLen;
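
To see why a length of 0 turns a capsule into a point, here is a scalar C++ sketch of the clamp that the lighting code in this recipe applies to each lane (DistOnLine = CapsuleLen * saturate(DistOnLine / CapsuleLenSafe)). Saturate and ClampDistOnLine are hypothetical helper names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>

// Clamp to [0, 1], like the HLSL saturate() intrinsic.
float Saturate(float v) { return std::min(std::max(v, 0.0f), 1.0f); }

// Scalar version of the per-lane clamp of the distance projected onto
// the capsule's segment.
float ClampDistOnLine(float distOnLine, float capsuleLen)
{
    assert(capsuleLen >= 0.0f);
    // Guard against dividing by zero when the light is not a capsule.
    const float capsuleLenSafe = std::max(capsuleLen, 1.e-6f);
    return capsuleLen * Saturate(distOnLine / capsuleLenSafe);
}
```

With a length of 0 the result is always 0, so the closest point on the segment is the light position itself and the lane behaves like a point or spot light.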

The cosine of each spot light's outer cone angle is again stored in a four-component constant. For non-spot light sources, set the respective value to -2:

float4 SpotCosOuterCone;

Unlike the single spot light, for the inner cone angle we are going to store one over the spot light's cosine inner cone angle. For non-spot light sources, set the respective value to 1:

float4 SpotCosInnerConeRcp;
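
The neutral values -2 and 1 can be checked with a scalar C++ sketch of the cone attenuation expression that the lighting code in this recipe evaluates per lane. Saturate and ConeAttenuation are hypothetical helper names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>

// Clamp to [0, 1], like the HLSL saturate() intrinsic.
float Saturate(float v) { return std::min(std::max(v, 0.0f), 1.0f); }

// Scalar version of:
//   conAtt = saturate((cosAng - SpotCosOuterCone) * SpotCosInnerConeRcp);
//   conAtt *= conAtt;
float ConeAttenuation(float cosAng, float spotCosOuterCone, float spotCosInnerConeRcp)
{
    const float conAtt = Saturate((cosAng - spotCosOuterCone) * spotCosInnerConeRcp);
    return conAtt * conAtt;
}
```

A cosine always lies in the range -1 to 1, so with an outer value of -2 and a reciprocal of 1 the term inside saturate is at least 1. The attenuation is therefore always 1, and the cone has no effect on point and capsule lights.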

We are going to use two new helper functions that calculate dot products on the packed four-component variables. The first one calculates the dot product between two groups of three four-component variables. The return value is a four-component variable holding the four dot product values. The code is as follows:

float4 dot4x4(float4 aX, float4 aY, float4 aZ, float4 bX, float4 bY, float4 bZ)
{
   return aX * bX + aY * bY + aZ * bZ;
}

The second helper function calculates the dot product of three four-component variables with a single three-component variable:

float4 dot4x1(float4 aX, float4 aY, float4 aZ, float3 b)
{
   return aX * b.xxxx + aY * b.yyyy + aZ * b.zzzz;
}
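
To make the lane layout concrete, here is a CPU-side C++ sketch of the two helpers. It is a reference for reasoning about the results, not shader code, and the Float3 and Float4 types and the Dot4x4 and Dot4x1 names are hypothetical. Lane i of each result is the ordinary three-component dot product for light i.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical CPU-side types used only for this illustration.
struct Float3 { float x, y, z; };
using Float4 = std::array<float, 4>;  // lane i belongs to light i

// Four dot products between two groups of packed vectors.
Float4 Dot4x4(const Float4& aX, const Float4& aY, const Float4& aZ,
              const Float4& bX, const Float4& bY, const Float4& bZ)
{
    Float4 result{};
    for (std::size_t i = 0; i < 4; ++i)
        result[i] = aX[i] * bX[i] + aY[i] * bY[i] + aZ[i] * bZ[i];
    return result;
}

// Four dot products between packed vectors and one shared vector.
Float4 Dot4x1(const Float4& aX, const Float4& aY, const Float4& aZ, const Float3& b)
{
    Float4 result{};
    for (std::size_t i = 0; i < 4; ++i)
        result[i] = aX[i] * b.x + aY[i] * b.y + aZ[i] * b.z;
    return result;
}
```

For example, packing the vectors (1, 0, 0), (0, 1, 0), (0, 0, 1), and (2, 2, 2) and calling Dot4x1 with b = (1, 2, 3) gives the four values 1, 2, 3, and 12.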

Finally, the code to calculate the lighting for the four light sources is as follows:

// The signature is assumed from the CalcFourLights name used in this recipe;
// Material is assumed to be the surface data struct used by the earlier recipes
float3 CalcFourLights(float3 position, Material material)
{
   float3 ToEye = EyePosition.xyz - position;

   // Find the shortest distance between the pixel and the capsule's segment
   float4 ToCapsuleStartX = position.xxxx - LightPosX;
   float4 ToCapsuleStartY = position.yyyy - LightPosY;
   float4 ToCapsuleStartZ = position.zzzz - LightPosZ;
   float4 DistOnLine = dot4x4(ToCapsuleStartX, ToCapsuleStartY, ToCapsuleStartZ, LightDirX, LightDirY, LightDirZ);
   float4 CapsuleLenSafe = max(CapsuleLen, 1.e-6);
   DistOnLine = CapsuleLen * saturate(DistOnLine / CapsuleLenSafe);
   float4 PointOnLineX = LightPosX + LightDirX * DistOnLine;
   float4 PointOnLineY = LightPosY + LightDirY * DistOnLine;
   float4 PointOnLineZ = LightPosZ + LightDirZ * DistOnLine;
   float4 ToLightX = PointOnLineX - position.xxxx;
   float4 ToLightY = PointOnLineY - position.yyyy;
   float4 ToLightZ = PointOnLineZ - position.zzzz;
   float4 DistToLightSqr = dot4x4(ToLightX, ToLightY, ToLightZ, ToLightX, ToLightY, ToLightZ);
   float4 DistToLight = sqrt(DistToLightSqr);

   // Phong diffuse
   ToLightX /= DistToLight; // Normalize
   ToLightY /= DistToLight; // Normalize
   ToLightZ /= DistToLight; // Normalize
   float4 NDotL = saturate(dot4x1(ToLightX, ToLightY, ToLightZ, material.normal));
   //float3 finalColor = float3(dot(LightColorR, NDotL), dot(LightColorG, NDotL), dot(LightColorB, NDotL));

   // Blinn specular
   ToEye = normalize(ToEye);
   float4 HalfWayX = ToEye.xxxx + ToLightX;
   float4 HalfWayY = ToEye.yyyy + ToLightY;
   float4 HalfWayZ = ToEye.zzzz + ToLightZ;
   float4 HalfWaySize = sqrt(dot4x4(HalfWayX, HalfWayY, HalfWayZ, HalfWayX, HalfWayY, HalfWayZ));
   float4 NDotH = saturate(dot4x1(HalfWayX / HalfWaySize, HalfWayY / HalfWaySize, HalfWayZ / HalfWaySize, material.normal));
   float4 SpecValue = pow(NDotH, material.specExp.xxxx) * material.specIntensity;
   //finalColor += float3(dot(LightColorR, SpecValue), dot(LightColorG, SpecValue), dot(LightColorB, SpecValue));

   // Cone attenuation
   float4 cosAng = dot4x4(LightDirX, LightDirY, LightDirZ, ToLightX, ToLightY, ToLightZ);
   float4 conAtt = saturate((cosAng - SpotCosOuterCone) * SpotCosInnerConeRcp);
   conAtt *= conAtt;

   // Attenuation
   float4 DistToLightNorm = 1.0 - saturate(DistToLight * LightRangeRcp);
   float4 Attn = DistToLightNorm * DistToLightNorm;
   Attn *= conAtt; // Include the cone attenuation

   // Calculate the final color value
   float4 pixelIntensity = (NDotL + SpecValue) * Attn;
   float3 finalColor = float3(dot(LightColorR, pixelIntensity), dot(LightColorG, pixelIntensity), dot(LightColorB, pixelIntensity));
   finalColor *= material.diffuseColor;

   return finalColor;
}

How it works…

Though this code is longer than the code used in the previous recipes, it basically works in the exact same way as in the single light source case. In order to support the three different light types in a single code path, both the capsule light's closest point on the line and the spot light's cone attenuation are used together.

If you compare the single light version of the code with the multiple lights version, you will notice that all the operations are done in the exact same order. The only change is that each operation that uses the packed constants has to be executed three times, once for each of the X, Y, and Z components, and the results have to be combined into a single four-component vector.

There's more…

Don't think that you are limited to four lights at a time just because of the GPU's constant size. You can rewrite CalcFourLights to take the light constant parameters as inputs, so you could call this function more than once in a shader.
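
On the CPU side, calling the function more than once means splitting the light list into groups of four. The following C++ sketch shows one possible way to do that for the color constants only, assuming a C++ host application. The Float3 and Float4 types, the LightColorGroup struct, and the GroupLightColors function are hypothetical names, and each input color is assumed to be already multiplied by the light's intensity. Lanes left over in the last group stay black, which is how this recipe disables a light.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU-side types used only for this illustration.
struct Float3 { float x, y, z; };     // here: r, g, b times intensity
using Float4 = std::array<float, 4>;  // lane i belongs to light i

// Mirrors the LightColorR / LightColorG / LightColorB shader constants.
struct LightColorGroup {
    Float4 LightColorR;
    Float4 LightColorG;
    Float4 LightColorB;
};

// Split any number of light colors into groups of four.
std::vector<LightColorGroup> GroupLightColors(const std::vector<Float3>& colors)
{
    const std::size_t groupCount = (colors.size() + 3) / 4;  // round up
    // Elements are value-initialized, so every lane starts out black.
    std::vector<LightColorGroup> groups(groupCount);
    for (std::size_t i = 0; i < colors.size(); ++i) {
        LightColorGroup& group = groups[i / 4];
        const std::size_t lane = i % 4;
        group.LightColorR[lane] = colors[i].x;
        group.LightColorG[lane] = colors[i].y;
        group.LightColorB[lane] = colors[i].z;
    }
    return groups;
}
```

Six lights, for example, produce two groups, and the last two lanes of the second group contribute nothing to the final color.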

Some scenes don't use all three light types. You can remove either the spot or capsule light support if those are not needed (point lights are at the base of the code, so those have to be supported). This will reduce the shader size and improve performance.

Another possible optimization is to combine the ambient, directional, and multiple lights code into a single shader. This will reduce the total number of draw calls needed and will improve performance.