Getting Started with Hazelcast

By: Matthew Johns


Therein lies the problem

However, in insulating the database from the read load, we have introduced a cache consistency problem: how does our local data cache deal with data changing underneath it in the primary database? The answer is rather depressing: it can't! The exact manifestation of any issues will largely depend on the data needs of the application and how frequently the data changes, but caching systems typically operate in one of the following two modes to combat the problem:

  • Time bound cache: Holds entries for a defined period (time-to-live or TTL)

  • Write through cache: Holds entries until they are invalidated by subsequent updates

Time bound caches almost always have consistency issues, but at least the period during which an entry can be stale is limited to its expiry time. However, we must also consider how the application accesses this data: if a particular entry is read less often than once per expiry period, it will usually have expired before the next read, and the cache provides no real benefit.
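To make both effects concrete, the following is a minimal, single-threaded sketch of a time bound cache. It is not Hazelcast code: the TtlCache class, its injectable clock, and the loader function standing in for the primary database are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical time bound cache: an entry is served for at most ttlMillis. Not thread-safe. */
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;
        Entry(V value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;     // injectable so that expiry is easy to demonstrate
    private final Function<K, V> loader;  // stands in for a read from the primary database
    private long hits;
    private long misses;

    TtlCache(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry != null && now < entry.expiresAt) {
            hits++;                       // may be stale, but only until expiresAt
            return entry.value;
        }
        misses++;                         // absent or expired: go back to the database
        V value = loader.apply(key);
        entries.put(key, new Entry<>(value, now + ttlMillis));
        return value;
    }

    long hits() { return hits; }
    long misses() { return misses; }

    public static void main(String[] args) {
        long[] now = {0};                 // fake clock, in milliseconds
        Map<String, String> database = new HashMap<>();
        database.put("user:1", "v1");
        TtlCache<String, String> cache = new TtlCache<>(1000, () -> now[0], database::get);

        System.out.println(cache.get("user:1")); // v1, loaded from the database
        database.put("user:1", "v2");            // the data changes underneath the cache
        now[0] = 500;
        System.out.println(cache.get("user:1")); // v1, stale but within the TTL
        now[0] = 1000;
        System.out.println(cache.get("user:1")); // v2, the entry expired and was reloaded

        // Reading less often than once per TTL: every read finds an expired entry.
        for (int i = 0; i < 3; i++) {
            now[0] += 2000;
            cache.get("user:1");
        }
        System.out.println(cache.hits() + " hits, " + cache.misses() + " misses"); // 1 hits, 5 misses
    }
}
```

In this sketch the stale read at 500 ms is the consistency issue, bounded by the one-second expiry, while the final loop shows that reads spaced further apart than the expiry time never hit the cache at all.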

Write through caches are consistent in isolation and can be configured to offer strict consistency, but if multiple write through caches exist within the overall architecture, there will be consistency issues between them. We can avoid this by using a more intelligent cache that features a communication mechanism between nodes, so that they can propagate entry invalidations to each other.
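The difference between isolated and communicating write through caches can be sketched as follows. This is an illustrative, in-process model only: CacheNode and InvalidationBus are hypothetical names, the bus stands in for whatever network messaging a real product would use, and a shared map plays the part of the primary database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InvalidatingCacheDemo {
    /** Hypothetical stand-in for a messaging channel between cache nodes. */
    static final class InvalidationBus {
        private final List<CacheNode> nodes = new ArrayList<>();

        void register(CacheNode node) { nodes.add(node); }

        void publish(CacheNode origin, String key) {
            for (CacheNode node : nodes) {
                if (node != origin) {
                    node.evict(key);   // every other node drops its copy
                }
            }
        }
    }

    /** Write through cache in front of a shared database; a null bus means it runs in isolation. */
    static final class CacheNode {
        private final Map<String, String> local = new HashMap<>();
        private final Map<String, String> database;
        private final InvalidationBus bus;

        CacheNode(Map<String, String> database, InvalidationBus bus) {
            this.database = database;
            this.bus = bus;
            if (bus != null) {
                bus.register(this);
            }
        }

        String read(String key) {
            return local.computeIfAbsent(key, database::get);
        }

        void write(String key, String value) {
            database.put(key, value);  // write through to the primary store
            local.put(key, value);     // this node's own view stays consistent
            if (bus != null) {
                bus.publish(this, key);
            }
        }

        void evict(String key) { local.remove(key); }
    }

    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        database.put("user:1", "v1");

        // Two isolated caches: B never hears about A's write.
        CacheNode a = new CacheNode(database, null);
        CacheNode b = new CacheNode(database, null);
        b.read("user:1");
        a.write("user:1", "v2");
        System.out.println("isolated:  " + b.read("user:1")); // isolated:  v1

        // Two caches sharing an invalidation bus: D's copy is evicted by C's write.
        InvalidationBus bus = new InvalidationBus();
        CacheNode c = new CacheNode(database, bus);
        CacheNode d = new CacheNode(database, bus);
        d.read("user:1");
        c.write("user:1", "v3");
        System.out.println("connected: " + d.read("user:1")); // connected: v3
    }
}
```

The sketch sends only the key rather than the new value, so each receiving node reloads from the database on its next read. That keeps the messages small at the cost of one extra database read per node per change.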

In practice, an ideal cache would combine both features, so that entries are held for a known maximum time, but invalidations are also passed around as changes are made.

So our evolved architecture would look a bit like the following figure:

So far we've looked at the general issues in scaling our data layer and introduced strategies to help combat the trade-offs we will encounter along the way; however, the real world isn't quite as simple. There are various cache servers and in-memory database products in this area; however, most of these are stand-alone single instances, perhaps with some degree of distribution bolted on or provided by other supporting technologies. This tends to bring about the same issues we experienced with just our primary database: we could encounter resource saturation or capacity issues if the product is a single instance, or, if the distribution doesn't provide consistency control, inconsistent data that might harm our application.