Spring Data

By: Petri Kainulainen

Overview of this book

Spring Framework has always had good support for different data access technologies. However, developers had to use technology-specific APIs, which often meant writing a lot of boilerplate code to implement even the simplest operations. Spring Data changed all this. Spring Data makes it easier to implement Spring-powered applications that use cloud-based storage services, NoSQL databases, map-reduce frameworks, or relational databases. "Spring Data" is a practical guide full of step-by-step instructions and examples that help you start using the Java Persistence API and Redis in your applications without extra hassle. This book provides a brief introduction to the underlying data storage technologies, gives step-by-step instructions that help you use the discussed technologies in your applications, and provides a solid foundation for expanding your knowledge beyond the concepts described in this book. You will learn an easier way to manage your entities and to create database queries with Spring Data JPA. This book also demonstrates how you can add custom functions to your repositories. You will also learn how to use the Redis key-value store as data storage and how to use its other features to enhance your applications. "Spring Data" gives you the practical instructions and examples you need to create JPA repositories with Spring Data JPA and to take advantage of the performance of Redis in your applications by using Spring Data Redis.

Redis


Redis is an in-memory data store that keeps its entire data set in memory and uses disk space only as secondary, persistent storage. Therefore, Redis can provide very fast read and write operations. The catch is that the size of the Redis data set cannot exceed the amount of available memory. The other features of Redis include:

  • Support for complex data types

  • Multiple persistence mechanisms

  • Master-slave replication

  • Implementation of the publish/subscribe messaging pattern

These features are described in the following subsections.

Supported data types

Each value stored by Redis has a key. Both keys and values are binary safe, which means that the key or the stored value can be either a string or the content of a binary file. However, Redis is more than just a simple key-value store. It supports multiple binary safe data types, which should be familiar to every programmer. These data types are as follows:

  • String: This is a data type where one key always refers to a single value.

  • List: This is a data type where one key refers to multiple string values, which are kept in insertion order.

  • Set: This is a collection of unordered strings that cannot contain the same value more than once.

  • Sorted set: This is similar to a set but each of its values has a score which is used to order the values of a sorted set from the lowest score to the highest. The same score can be assigned to multiple values.

  • Hash: This is a data type where a single hash key always refers to a specific map of string keys and values.
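
As a rough mental model, the data types listed above can be compared with the standard Java collections. The following sketch uses only plain Java collections and does not talk to a Redis server; the class name RedisDataTypesSketch, the method orderByScore, and all keys and values are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration only: plain Java collections, no Redis client involved.
class RedisDataTypesSketch {

    // Sorted set: unique members, each with a score, read back from the lowest
    // score to the highest. Members that share a score are ordered by name here.
    static List<String> orderByScore(Map<String, Double> scores) {
        return scores.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue()
                        .thenComparing(Map.Entry.<String, Double>comparingByKey()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // String: one key refers to a single value.
        Map<String, String> strings = new HashMap<>();
        strings.put("greeting", "hello");

        // List: one key refers to several values kept in insertion order; duplicates are allowed.
        List<String> recentPages = new ArrayList<>(List.of("home", "search", "home"));

        // Set: unordered, and the same value cannot be stored twice.
        Set<String> tags = new HashSet<>(List.of("java", "redis", "java"));

        // Sorted set: the same score can be assigned to several members.
        Map<String, Double> leaderboard = new HashMap<>();
        leaderboard.put("bob", 2.0);
        leaderboard.put("alice", 1.0);
        leaderboard.put("carol", 2.0);

        // Hash: one key refers to a map of string fields and values.
        Map<String, String> user = new HashMap<>();
        user.put("firstName", "John");
        user.put("lastName", "Doe");

        System.out.println(strings + " " + recentPages + " " + tags.size()
                + " " + orderByScore(leaderboard) + " " + user);
    }
}
```

The comparison is only approximate: in Redis every one of these structures lives under a key on the server, and the stored values are binary safe strings rather than Java objects.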

Persistence

Redis supports two persistence mechanisms that can be used to store the data set on disk. They are as follows:

  • RDB is the simplest persistence mechanism of Redis. It takes snapshots of the in-memory data set at configured intervals and stores each snapshot on disk. When a server is started, it reads the data set back into memory from the snapshot file. This is the default persistence mechanism of Redis.

    RDB maximizes the performance of your Redis server, and its file format is really compact, which makes it a very useful tool for disaster recovery. Also, if you want to use the master-slave replication, you have to use RDB because the RDB snapshots are used when the data is synchronized between the master and the slaves.

    However, if you have to minimize the chance of data loss in all situations, RDB is not the right solution for you. Because RDB persists the data at configured intervals, you can always lose the data that was written to your Redis instance after the last snapshot was saved to disk.

  • Append Only File (AOF) is a persistence model that logs each operation that changes the state of the in-memory data set to a log file. When a Redis instance is started, it reconstructs the data set by executing all operations found in the log file.

    The advantage of the AOF is that it minimizes the chance of data loss in all situations. Also, since the log file is an append log, it cannot be irreversibly corrupted. On the other hand, AOF log files are usually larger than RDB files for the same data, and AOF can be slower than RDB if the server is experiencing a huge write load.

You can also enable both persistence mechanisms and get the best of both worlds: RDB snapshots give you compact backups of your data set, while the AOF keeps the risk of data loss low. In this case, Redis uses the AOF log file to rebuild the data set on server startup because it most likely contains the latest data.

If you are using Redis as temporary data storage and do not need persistence, you can disable both persistence mechanisms. This means that the data set is lost when the server is shut down.
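
The difference between the two mechanisms can be illustrated with a toy model. The sketch below is not how Redis is implemented; it only mimics the idea of a point-in-time snapshot (RDB) and a replayable operation log (AOF) with in-memory Java collections. The class PersistenceSketch and its methods are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: both "files" are kept in memory to keep the example self-contained.
class PersistenceSketch {
    private final Map<String, String> memory = new HashMap<>();  // the in-memory data set
    private Map<String, String> snapshot = new HashMap<>();      // stands in for the RDB file
    private final List<String[]> appendLog = new ArrayList<>();  // stands in for the AOF file

    // Every write changes the data set and is appended to the log (AOF idea).
    void set(String key, String value) {
        memory.put(key, value);
        appendLog.add(new String[] {key, value});
    }

    // Copies the whole data set at one point in time (RDB idea).
    void saveSnapshot() {
        snapshot = new HashMap<>(memory);
    }

    // Restart using only the snapshot: writes made after the last snapshot are missing.
    Map<String, String> recoverFromSnapshot() {
        return new HashMap<>(snapshot);
    }

    // Restart using only the log: the data set is rebuilt by replaying every operation.
    Map<String, String> recoverFromLog() {
        Map<String, String> rebuilt = new HashMap<>();
        for (String[] operation : appendLog) {
            rebuilt.put(operation[0], operation[1]);
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        PersistenceSketch store = new PersistenceSketch();
        store.set("a", "1");
        store.saveSnapshot();
        store.set("b", "2"); // written after the last snapshot

        System.out.println("From snapshot: " + store.recoverFromSnapshot());
        System.out.println("From log:      " + store.recoverFromLog());
    }
}
```

In this model the value stored under the key b survives only in the log, which is the trade-off described above: the snapshot is compact but can lag behind, whereas the log grows with every write but can reproduce the latest state.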

Replication

Redis supports master-slave replication where a single master can have one or more slaves. Each slave is an exact copy of its master, and a slave can connect either to the master or to another slave. In other words, a slave can be the master of other slaves. Since Redis 2.6, each slave is read-only by default, and all write operations to a slave are rejected. If we need to store temporary information on a slave, we have to configure that slave to allow write operations.

Replication is non-blocking on both sides. It will not block the queries made to the master even when a slave or slaves are synchronizing their data for the very first time. Slaves can be configured to serve the old data when they are synchronizing their data with the master. However, incoming connections to a slave will be blocked for a short period of time when the old data is replaced with the new data.

If a slave loses its connection to the master, it will either continue serving the old data or return an error to the clients, depending on its configuration. When the connection between the master and a slave is lost, the slave automatically reopens the connection and sends a synchronization request to the master.
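
The replication topology described above (a master, a slave, and a slave of that slave) can be sketched as a small in-memory model. This is a simplification under stated assumptions: changes are propagated synchronously here, whereas Redis replicates asynchronously, and the names ReplicationSketch, Node, addSlave, clientWrite, and apply are invented for this example. In a real installation, the read-only behavior and the serving of old data are controlled by the slave-read-only and slave-serve-stale-data directives in redis.conf.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: no networking, no reconnection logic, synchronous propagation.
class ReplicationSketch {

    static class Node {
        final Map<String, String> data = new HashMap<>();
        final List<Node> slaves = new ArrayList<>();
        final boolean readOnly; // true for a slave in its default configuration

        Node(boolean readOnly) {
            this.readOnly = readOnly;
        }

        // Attaches a slave and copies the current data set to it (initial synchronization).
        void addSlave(Node slave) {
            slave.data.clear();
            slave.data.putAll(data);
            slaves.add(slave);
        }

        // A write sent by a client. Read-only nodes reject it.
        void clientWrite(String key, String value) {
            if (readOnly) {
                throw new IllegalStateException("write rejected: node is read-only");
            }
            apply(key, value);
        }

        // Applies a change locally and passes it on to this node's own slaves.
        void apply(String key, String value) {
            data.put(key, value);
            for (Node slave : slaves) {
                slave.apply(key, value);
            }
        }
    }

    public static void main(String[] args) {
        Node master = new Node(false);
        Node slave = new Node(true);
        Node slaveOfSlave = new Node(true);

        master.clientWrite("page:1", "draft");
        master.addSlave(slave);        // slave receives the existing data
        slave.addSlave(slaveOfSlave);  // a slave acting as the master of another slave
        master.clientWrite("page:1", "published");

        System.out.println(slaveOfSlave.data);
    }
}
```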

Publish/subscribe messaging pattern

The publish/subscribe messaging pattern is a messaging pattern where the message sender (publisher) does not send messages directly to the receiver (subscriber). Instead, an additional element called a channel is used to transport messages from the publisher to the subscriber. Publishers can send a message to one or more channels. Subscribers can select the interesting channels and receive messages sent to these channels by subscribing to those channels.

Let's think of a situation where a single publisher is publishing messages to two channels, Channel 1 and Channel 2. Channel 1 has two subscribers: Subscriber 1 and Subscriber 2. Channel 2 also has two subscribers: Subscriber 2 and Subscriber 3. This situation is illustrated in the following figure:

The publish/subscribe pattern ensures that the publishers are not aware of the subscribers and vice versa. This gives us the possibility to divide our application into smaller modules, which have loose coupling between them. This makes the modules easier to maintain and replace if needed.

However, the greatest advantage of the publish/subscribe pattern is also its greatest weakness. Firstly, our application cannot rely on the fact that a specific component has subscribed to a specific channel. Secondly, there is no clean way for us to verify if this is the case. In fact, our application cannot assume that anyone is listening.

Redis offers solid support for the publish/subscribe pattern. The main features of its publish/subscribe implementation are:

  • Publishers can publish messages to one or more channels at the same time

  • Subscribers can subscribe to the interesting channels by using the name of the channel or a pattern containing a wildcard

  • Unsubscribing from channels also supports both name and pattern matching
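
The scenario described earlier (one publisher, two channels, three subscribers) and subscription by pattern can be sketched with a minimal in-memory broker. This is not the Redis or Spring Data Redis API; the class PubSubSketch, its methods, and the channel names are hypothetical. Only the * wildcard is handled in this sketch, and unsubscribing is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Pattern;

// Toy broker only: everything runs in one JVM and messages are delivered synchronously.
class PubSubSketch {
    private final Map<String, List<Consumer<String>>> byChannel = new HashMap<>();
    private final Map<String, List<Consumer<String>>> byPattern = new HashMap<>();

    // Subscribes by the exact name of the channel.
    void subscribe(String channel, Consumer<String> subscriber) {
        byChannel.computeIfAbsent(channel, c -> new ArrayList<>()).add(subscriber);
    }

    // Subscribes to every channel whose name matches a pattern such as "news.*".
    void patternSubscribe(String glob, Consumer<String> subscriber) {
        byPattern.computeIfAbsent(glob, g -> new ArrayList<>()).add(subscriber);
    }

    // The publisher names only the channel; it never sees who is subscribed.
    void publish(String channel, String message) {
        for (Consumer<String> subscriber : byChannel.getOrDefault(channel, List.of())) {
            subscriber.accept(message);
        }
        for (Map.Entry<String, List<Consumer<String>>> entry : byPattern.entrySet()) {
            if (matches(entry.getKey(), channel)) {
                for (Consumer<String> subscriber : entry.getValue()) {
                    subscriber.accept(message);
                }
            }
        }
    }

    // Treats '*' as "any sequence of characters" and everything else literally.
    static boolean matches(String glob, String channel) {
        String regex = Pattern.quote(glob).replace("*", "\\E.*\\Q");
        return Pattern.matches(regex, channel);
    }

    public static void main(String[] args) {
        PubSubSketch broker = new PubSubSketch();
        List<String> subscriber1 = new ArrayList<>();
        List<String> subscriber2 = new ArrayList<>();
        List<String> subscriber3 = new ArrayList<>();

        broker.subscribe("channel1", subscriber1::add);
        broker.subscribe("channel1", subscriber2::add);
        broker.subscribe("channel2", subscriber2::add);
        broker.subscribe("channel2", subscriber3::add);

        broker.publish("channel1", "first message");
        broker.publish("channel2", "second message");
        broker.publish("channel3", "nobody is listening"); // silently dropped

        System.out.println(subscriber1 + " " + subscriber2 + " " + subscriber3);
    }
}
```

The message published to channel3 shows the weakness mentioned above: the call succeeds even though no subscriber receives anything.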