Poor rendering performance can manifest itself in a number of ways, depending on whether the device is limited by CPU activity (we are CPU bound) or by GPU activity (we are GPU bound). Investigating a CPU-bound application can be relatively simple, since all of the CPU work is wrapped up in loading data from disk/memory and calling Graphics API instructions. However, a GPU-bound application can be more difficult to analyze, since the root cause could originate from any of a large number of places within the Rendering Pipeline. We might need to rely on a little guesswork or a process of elimination to determine the source of a GPU bottleneck. In either case, once the problem is discovered and resolved, we can expect significant improvements, since small fixes tend to reap big rewards in the Rendering Pipeline.
We briefly touched on the Rendering Pipeline in Chapter 3, The Benefits of Batching. To summarize...