The cost of abstractions in terms of data size plays an important role. For example, whether a data element fits into a processor cache line depends directly on its size. On a Linux system, we can find the cache line size and other cache parameters by inspecting the files under the /sys/devices/system/cpu/cpu0/cache/ directory. Refer to Chapter 4, Host Performance, where we discussed how to compute the size of primitives, objects, and data elements.
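As a rough illustration of inspecting those files, here is a minimal Java sketch. It assumes the usual Linux sysfs layout, in which the cache directory contains index0, index1, and so on, each holding the attribute files level, type, size, and coherency_line_size. The class name CacheInfo and its helper methods are hypothetical names chosen for this example. On non-Linux hosts, or in some virtualized environments, the directory may be missing, in which case the sketch simply prints nothing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for this example; not part of any library.
final class CacheInfo {
    // Directory named in the text; typically absent on non-Linux hosts.
    static final Path DEFAULT_ROOT = Paths.get("/sys/devices/system/cpu/cpu0/cache");

    // Reads one attribute file and trims the trailing newline.
    // Returns "?" when the file is missing or unreadable.
    static String readAttr(Path indexDir, String name) {
        try {
            byte[] bytes = Files.readAllBytes(indexDir.resolve(name));
            return new String(bytes, StandardCharsets.UTF_8).trim();
        } catch (IOException e) {
            return "?";
        }
    }

    // Formats the attributes of one cache into a single line.
    static String summarize(String level, String type, String size, String lineSize) {
        return "L" + level + " " + type + ", size=" + size + ", line=" + lineSize + " bytes";
    }

    // Maps each indexN directory name to a one-line description, sorted by name.
    // Returns an empty map when the root directory does not exist.
    static Map<String, String> describe(Path root) {
        Map<String, String> result = new TreeMap<>();
        if (!Files.isDirectory(root)) {
            return result;
        }
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(root, "index*")) {
            for (Path dir : dirs) {
                result.put(dir.getFileName().toString(),
                        summarize(readAttr(dir, "level"),
                                  readAttr(dir, "type"),
                                  readAttr(dir, "size"),
                                  readAttr(dir, "coherency_line_size")));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        describe(DEFAULT_ROOT).forEach((index, info) -> System.out.println(index + ": " + info));
    }
}
```

The reported values vary by processor, so it is safer to read them on the target host than to assume a particular cache line size.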
Another common concern with data sizing is how much data we hold in the heap at a time. As we noted in earlier chapters, GC has direct consequences on the application's performance. While processing data, we often do not need all the data we hold on to. Consider the example of generating a summary report of sold items for a certain period (several months). Once the summary data for a subperiod (a month) is computed, we no longer need that subperiod's item details, hence it's better to remove the unwanted data...
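The idea of dropping item details once a month's summary is computed can be sketched as follows. This is only an illustration under stated assumptions: the SalesReport, SoldItem, and MonthSummary classes, their fields, and the loadMonth function are all hypothetical, standing in for whatever data source and report shape the application actually uses. The point is that only one month of details is reachable at any time, while the map retains just the compact summaries.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical report builder for this example.
final class SalesReport {
    // Hypothetical detail record: one line of a sale.
    static final class SoldItem {
        final String name;
        final int quantity;
        final long unitPriceCents;

        SoldItem(String name, int quantity, long unitPriceCents) {
            this.name = name;
            this.quantity = quantity;
            this.unitPriceCents = unitPriceCents;
        }
    }

    // Compact per-month summary; this is all we keep after processing a month.
    static final class MonthSummary {
        final int itemsSold;
        final long revenueCents;

        MonthSummary(int itemsSold, long revenueCents) {
            this.itemsSold = itemsSold;
            this.revenueCents = revenueCents;
        }
    }

    // Reduces one month's item details to a summary.
    static MonthSummary summarize(Iterable<SoldItem> items) {
        int count = 0;
        long revenue = 0;
        for (SoldItem item : items) {
            count += item.quantity;
            revenue += item.quantity * item.unitPriceCents;
        }
        return new MonthSummary(count, revenue);
    }

    // Loads and summarizes one month at a time, in the order given.
    // loadMonth is an assumed callback that fetches the details for a month.
    static Map<String, MonthSummary> report(List<String> months,
                                            Function<String, List<SoldItem>> loadMonth) {
        Map<String, MonthSummary> summaries = new LinkedHashMap<>();
        for (String month : months) {
            List<SoldItem> details = loadMonth.apply(month);
            summaries.put(month, summarize(details));
            // details goes out of scope here; if nothing else references the list,
            // it becomes eligible for garbage collection before the next month loads.
        }
        return summaries;
    }

    public static void main(String[] args) {
        Map<String, MonthSummary> result = report(
                Arrays.asList("2024-01"),
                month -> Arrays.asList(new SoldItem("pen", 2, 150)));
        result.forEach((month, s) ->
                System.out.println(month + ": " + s.itemsSold + " items, " + s.revenueCents + " cents"));
    }
}
```

The alternative of loading every month's details into one collection and summarizing at the end would keep all of the details reachable for the whole run, which is the situation the text advises against.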