Book Image

Infinispan data grid platform definitive guide

Book Image

Infinispan data grid platform definitive guide

Overview of this book

Table of Contents (20 chapters)
Infinispan Data Grid Platform Definitive Guide
Credits
About the Author
Acknowledgments
About the Reviewers
www.PacktPub.com
Preface
Index

Introducing the Infinispan data grid


Generally, data grids are considered an evolution of distributed caches. As the name implies, a distributed cache is characterized by the usage of multiple servers to host cached data, so that it can grow in size and in capacity. A distributed cache solution is mainly used to store temporary application data, such as web session data.

There are several benefits to using an in-memory data grid like Infinispan; a good and simple example is database offloading. Database offloading is the process of removing or reducing the database load, mainframes, and shared or partner services.

As mentioned in the introduction, Cloud platforms are also changing the way we store and distribute our data. A common solution adopted on cloud architectures is to expose the data through REST services.

In these cloud architectures, applications that expose the business data via REST services to clients normally decompose the system into layers, in order to separate the user interface from the business logic, as illustrated in the following image:

In the image, you can see how the data layer fits in a layered application. In our example, the data layer hides the implementation details to the database. We can consider that the database is a single repository of data, which is accessed at first by a few clients. However, that's the scenario if you're on a cloud environment, things can change fast.

From the performance point of view, retrieving data for every request from one single repository can be expensive, both in terms of time and hardware cost.

Also, you may need to provide access to your data for many clients' social media and mobile device integration, which can decrease the performance of your services, thereby impacting user experience.

This scenario can be a good example of offloading; in our scenario, we designed our database to provide a certain amount of data via REST services, which does not only impact the response time of the database request, but can also impact the whole infrastructure where the data is being transmitted.

Infinispan can work well in very demanding environments, as in this scenario. Thoughtful use of a distributed caching to offload the network and database can really improve performance. For instance, when a client makes a web method call to the service layer, which in turn performs several calls to the database, you could cache the query results in an Infinispan cache. Then, the next time this client needs to make the same Web method call, the service layer gets that data from the cache instead. This step can really improve overall performance because the application does not have to make an expensive database call. An Infinispan cache, in this scenario, can also reduce the pressure on the service layer, because data in an Infinispan cache can be distributed among all nodes of the cluster.

The following figure presents the same architecture as included in the first image, but now with an Infinispan cache:

At the same time, Infinispan can also power embedded applications, be used as a JPA/Hibernate second-level cache provider, or as a highly available key/value data store.

The scenario we presented previously is one of the two possible ways you can interact with Infinispan. We are using Infinispan in the embedded mode, which initiates the Infinispan data grid within the same JVM as the application.

The other way is the client/server mode, where you have an Infinispan data grid instance running in a separated server, which delivers and manages the grid content to be consumed by the client, including non-java clients, such as C++, Python, and .NET.

Tip

We will cover Infinispan Server in detail in Chapter 9, Server Modules.

Infinispan also provides an extremely low latency access to the cache, and high availability of the application data, by keeping the data in the memory and distributing it to several nodes of the grid, which makes your application able to load terabytes of data into memory. However, it not only provides a new attempt to use the main memory as a storage area instead of a disk, (Infinispan perform much faster than disk-based databases) but it also provides features such as:

  • Data partitioning across a cluster

  • Work with domain objects rather than only bytes, arrays, or strings

  • Synchronous and asynchronous operations throughout

  • Distributed ACID transactions

  • Data distribution through the use of a consistent hash algorithm to determine where keys should be located in the cluster

  • Write-through/behind cache store support

  • Eviction support

  • Elastic scaling

  • Multiple access protocols

  • Support for compute grids

  • Persisting state to configurable cache stores

We will cover all these features in detail throughout this chapter.

From a Java developer perspective, an Infinispan cache can be seen as a distributed key-value object store similar in its interface to a typical concurrent hash map, in a way that it can have any application domain object as either a value or a key.