Mastering Python High Performance

Memory consumption and memory leaks

Another very important resource to consider when developing software is memory. Regular software developers don't really care much about it, since the era of the PC with 640 KB of RAM is long dead. However, a memory leak in a long-running program can turn any server into a 640 KB computer. Memory consumption is not just about having enough memory for your program to run; it's also about having control over the memory that your programs use.

Some types of development, such as embedded systems, actually require developers to pay extra attention to the amount of memory they use, because it is a limited resource in those systems. However, an average developer can expect their target system to have the amount of RAM they require.

With plenty of RAM and higher-level languages that come with automatic memory management (like garbage collection), developers are less likely to pay much attention to memory utilization, trusting the platform to do it for them.

Keeping track of memory consumption is relatively straightforward. At least for a basic approach, just use your OS's task manager. It'll display, among other things, the amount of memory used or at least the percentage of total memory used by your program. The task manager is also a great tool to check your CPU time consumption. As you can see in the next screenshot, a simple Python program (the preceding one) is taking up almost the entire CPU power (99.8 percent), and barely 0.1 percent of the total memory that is available:

With a tool like that (the top command-line tool from Linux), spotting memory leaks can be easy, but that will depend on the type of software you're monitoring. If your program is constantly loading data, its memory consumption pattern will be different from that of a program that doesn't have to deal much with external resources.
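The readings that top reports can also be sampled from inside the program itself. The following is only a minimal sketch, assuming a Unix-like system where the standard library's `resource` module is available. The helper name `report_usage` is hypothetical, and note that `ru_maxrss` is the peak resident set size, reported in kilobytes on Linux and in bytes on macOS:

```python
import resource
import sys


def report_usage():
    """Return (cpu_seconds, peak_rss_bytes) for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_maxrss is in kilobytes on Linux but in bytes on macOS.
    scale = 1 if sys.platform == "darwin" else 1024
    cpu_seconds = usage.ru_utime + usage.ru_stime  # user + system CPU time
    return cpu_seconds, usage.ru_maxrss * scale


if __name__ == "__main__":
    # Burn some CPU and allocate some memory so there is something to report.
    data = [i * i for i in range(1_000_000)]
    cpu, peak = report_usage()
    print(f"CPU time: {cpu:.2f}s, peak memory: {peak / 1024 ** 2:.1f} MiB")
```

Because this reports the peak rather than the current value, it is better suited to a one-off check at the end of a run than to charting consumption over time.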

For instance, if we were to chart the memory consumption over time of a program dealing with lots of external data, it would look like the following chart:

There will be peaks, when these resources get fully loaded into memory, but there will also be some drops, when those resources are released. Although the memory consumption numbers fluctuate quite a bit, it's still possible to estimate the average amount of memory that the program will use when no resources are loaded. Once you define that area (marked as a green box in the preceding chart), you can spot memory leaks.

Let's look at how the same chart would look with bad resource handling (not fully releasing allocated memory):

In the preceding chart, you can clearly see that not all memory is released when a resource is no longer used, which causes the line to move out of the green box. This means the program consumes more and more memory over time, even after the loaded resources are released.
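The difference between the two charts can be reproduced in a few lines. This is a hedged sketch that uses the standard library's `tracemalloc` module to read the memory still allocated after each load/release cycle. The functions `process_clean`, `process_leaky`, and `baseline_after_cycles`, as well as the `_cache` list, are hypothetical names invented for this illustration:

```python
import tracemalloc

_cache = []  # hypothetical module-level list that outlives each call


def process_clean(size):
    """Load a resource, use it, and let it go out of scope."""
    resource_data = bytearray(size)
    return len(resource_data)


def process_leaky(size):
    """Same work, but a reference to the resource is kept by mistake."""
    resource_data = bytearray(size)
    _cache.append(resource_data)  # the leak: nothing ever removes this
    return len(resource_data)


def baseline_after_cycles(worker, cycles=5, size=1_000_000):
    """Return the traced memory (in bytes) measured after each cycle."""
    tracemalloc.start()
    readings = []
    for _ in range(cycles):
        worker(size)
        current, _peak = tracemalloc.get_traced_memory()
        readings.append(current)
    tracemalloc.stop()
    return readings


if __name__ == "__main__":
    print("clean:", baseline_after_cycles(process_clean))
    print("leaky:", baseline_after_cycles(process_leaky))
```

With the clean worker, the readings should stay close to a constant baseline. With the leaky one, each cycle should add roughly `size` bytes that are never given back.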

The same can be done with programs that aren't resource-heavy, for instance, scripts that execute a particular processing task for a considerable period of time. In those cases, the memory consumption and the leaks should be easier to spot.

Let's take a look at an example:

When the processing stage starts, the memory consumption should stabilize within a clearly defined range. If we spot numbers outside that range, especially if they leave it and never come back, we're looking at another example of a memory leak.
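That rule of thumb can be written down as a small check over a series of memory readings. The sketch below is only one possible interpretation: the function name `never_returns`, the `min_run` parameter, and the sample values (in MB) are all made up for illustration, and the expected range is assumed to be known in advance:

```python
def never_returns(samples, low, high, min_run=3):
    """Return True if the last `min_run` readings are all outside [low, high].

    A short spike that falls back into the range is not flagged; only a
    series that has left the range and stayed out of it is.
    """
    if len(samples) < min_run:
        return False
    tail = samples[-min_run:]
    return all(reading < low or reading > high for reading in tail)


if __name__ == "__main__":
    low, high = 48, 55  # hypothetical stable range, in MB
    stable = [50, 52, 51, 53, 50, 52, 51]
    spike = [50, 52, 70, 51, 53, 50, 52]
    leak = [50, 52, 51, 56, 61, 67, 72, 78]
    for name, series in (("stable", stable), ("spike", spike), ("leak", leak)):
        print(name, never_returns(series, low, high))
```

Requiring several consecutive readings outside the range is a design choice that avoids flagging a single temporary peak as a leak.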

Let's look at an example of such a case: