As soon as you hit the first performance bottleneck in your application, there are two steps to carry out: analyze the problem and fix it. If your application design is not completely flawed, fixing usually takes a fraction of the time needed for analysis. The analysis itself has two parts: locating where the problem occurs in your platform, and reproducing it reliably so that it can be fixed. Most problems will surface in production, because lab testing never completely uncovers real-life problems.
After you have fixed the problem, the operations department, or you yourself, will hopefully raise the question of how to make sure this does not happen again. The answer is simple: monitoring. You should always be able to return reliable statistics from the core of the application instead of relying on external measuring points such as database query times or HTTP request and response times.
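To make the idea of statistics coming from the core of the application more concrete, here is a minimal sketch in Python. The `AppStats` class, its method names, and the metric name `orders.load` are all hypothetical and chosen only for illustration. It is one possible way to collect counters and timings in-process, not a prescribed design.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager


class AppStats:
    """Hypothetical in-process statistics registry (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = defaultdict(int)   # metric name -> running count
        self._timings = defaultdict(list)   # metric name -> durations in seconds

    def increment(self, name, amount=1):
        """Add `amount` to the named counter."""
        with self._lock:
            self._counters[name] += amount

    @contextmanager
    def timed(self, name):
        """Record how long the wrapped block takes, even if it raises."""
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            with self._lock:
                self._timings[name].append(elapsed)

    def snapshot(self):
        """Return a plain dict that a status page or log line could expose."""
        with self._lock:
            report = {"counters": dict(self._counters), "timings": {}}
            for name, samples in self._timings.items():
                report["timings"][name] = {
                    "count": len(samples),
                    "total_seconds": sum(samples),
                    "max_seconds": max(samples),
                }
            return report


# Usage: instrument a (made-up) operation inside the application itself.
stats = AppStats()
with stats.timed("orders.load"):
    stats.increment("orders.load.calls")
print(stats.snapshot())
```

Because the numbers are gathered where the work actually happens, they describe the application's own view of an operation rather than what an outside probe can infer. A real system would probably bound the stored samples and export the snapshot to whatever monitoring tool is already in place.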
There should be data about cache hit...