Hands-On Reactive Programming in Spring 5

By: Oleh Dokuka, Igor Lozynskyi

Overview of this book

These days, businesses need a new type of system that can remain responsive at all times. This is achievable with reactive programming; however, the development of these kinds of systems is a complex task, requiring a deep understanding of the domain. To support the development of highly responsive systems, the developers of the Spring Framework came up with Project Reactor. Hands-On Reactive Programming in Spring 5 begins with the fundamentals of reactive programming in Spring. You’ll explore the endless possibilities of building efficient reactive systems with the Spring 5 Framework along with other tools such as WebFlux and Spring Boot. Further on, you’ll study reactive programming techniques and apply them to databases and cross-server communication. You will advance your skills in scaling up Spring Cloud Streams and run independent, high-performance reactive microservices. By the end of the book, you will be able to put your skills to use and get on board with the reactive revolution in Spring 5.1!
Table of Contents (12 chapters)

Why reactive?

Nowadays, reactive is a buzzword—so exciting but so confusing. However, should we still care about reactivity even if it takes an honorable place in conferences around the world? If we google the word reactive, we will see that the most popular association is programming, in which it defines the meaning of a programming model. However, that is not the only meaning for reactivity. Behind that word, there are hidden fundamental design principles aimed at building a robust system. To understand the value of reactivity as an essential design principle, let's imagine that we are developing a small business.

Suppose our small business is a web store with a few cutting-edge products at an attractive price. As is the case with the majority of projects in this sector, we will hire software engineers to solve any problems that we encounter. We opted for the traditional approaches to development, and, over a few development iterations, we created our store.

Usually, our service is visited by about one thousand users per hour. To serve the usual demand, we bought a modern computer and ran the Tomcat web server on it, configuring Tomcat's thread pool with 500 allocated threads. The average response time for the majority of user requests is about 250 milliseconds. A naive capacity calculation for that configuration (500 threads, each completing four requests per second) suggests that the system can handle about 2,000 user requests per second. According to statistics, the number of users previously mentioned produced around 1,000 requests per second on average. Consequently, the current system's capacity will be enough for the average load.
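The naive estimate can be written out explicitly. The following is a minimal sketch, assuming each thread serves exactly one request at a time and every request takes the average response time; the class and method names are illustrative and do not come from Tomcat or Spring.

```java
// Naive capacity estimate for a blocking thread pool:
// throughput = threads / average response time (in seconds).
class CapacityEstimate {

    // Hypothetical helper: upper bound on requests per second.
    static double maxRequestsPerSecond(int threads, long avgResponseMillis) {
        return threads * 1000.0 / avgResponseMillis;
    }

    public static void main(String[] args) {
        double capacity = maxRequestsPerSecond(500, 250); // figures from the text
        double averageLoad = 1_000;                        // requests per second, from the text
        System.out.println("Capacity: " + capacity + " req/s");
        System.out.println("Headroom: " + (capacity / averageLoad) + "x");
    }
}
```

With the numbers from the text, the estimate is 2,000 requests per second, twice the average load. The estimate ignores queuing, garbage collection, and slow downstream calls, which is why it is only an upper bound.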

To summarize, we configured our application with a comfortable capacity margin. Moreover, our web store had been working stably until the last Friday in November, which is Black Friday.

Black Friday is a valuable day for both customers and retailers. For the customer, it is a chance to buy goods at discounted prices. And for retailers, it is a way to earn money and popularize products. However, this day is characterized by an unusual influx of clients, and that may be a significant cause of failure in production.

And, of course, we failed! At some point in time, the load exceeded all expectations. There were no vacant threads in the thread pool to process user requests. In turn, the backup server was not able to handle such an unpredictable invasion, and, in the end, this caused a rise in the response time and periodic service outage. At this point, we started losing some user requests, and, finally, our clients became dissatisfied and preferred dealing with competitors.

In the end, a lot of potential customers and money were lost, and the store's rating decreased. This was all a result of the fact that we couldn't stay responsive under the increased workload.

But, don't worry, this is nothing new. At one point in time, giants such as Amazon and Walmart also faced this problem and have since found a solution. Nevertheless, we will follow the same roads as our predecessors, gaining an understanding of the central principles of designing robust systems and then providing a general definition for them.

To learn more about these giants' failures, see:

Now, the central question that should remain in our minds is—How should we be responsive? As we might now understand from the example given previously, an application should react to changes. This should include changes in demand (load) and changes in the availability of external services. In other words, it should be reactive to any changes that may affect the system's ability to respond to user requests.

One of the first ways to achieve the primary goal is through elasticity. This describes the ability to stay responsive under a varying workload, meaning that the throughput of the system should increase automatically when more users start using it and it should decrease automatically when the demand goes down. From the application perspective, this feature enables system responsiveness because at any point in time the system can be expanded without affecting the average latency.

Note that latency is the essential characteristic of responsiveness. Without elasticity, growing demand will cause the growth of average latency, which directly affects the responsiveness of the system.

For example, by providing additional computation resources or additional instances, the throughput of our system might be increased. The responsiveness will then increase as a consequence. On the other hand, if demand is low, the system should shrink in terms of resource consumption, thereby reducing business expenses. We may achieve elasticity by employing scalability, which might either be horizontal or vertical. However, achieving scalability of the distributed system is a challenge that is typically limited by the introduction of bottlenecks or synchronization points within the system. From the theoretical and practical perspectives, such problems are explained by Amdahl's Law and Gunther's Universal Scalability Model. We will discuss these in Chapter 6, WebFlux Async Non-Blocking Communication.

Here, the term business expenses refers to the cost of additional cloud instances or extra power consumption in the case of physical machines.
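To make the scalability limits mentioned above concrete, here is a small sketch of the two models in their commonly cited forms. The class name, the parameter names, and the sample coefficients are illustrative assumptions, not values taken from the book.

```java
// Amdahl's Law and Gunther's Universal Scalability Law (USL), commonly cited forms.
class ScalabilityModels {

    // Amdahl's Law: speedup with n workers when a fraction p of the work can run in parallel.
    static double amdahlSpeedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    // USL: relative capacity with n workers, where alpha models contention
    // (serialization) and beta models the cost of keeping workers coherent.
    static double uslCapacity(int n, double alpha, double beta) {
        return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1));
    }

    public static void main(String[] args) {
        // Sample coefficients chosen only for illustration.
        for (int n : new int[] {1, 2, 8, 32, 128}) {
            System.out.printf("n=%d amdahl=%.2f usl=%.2f%n",
                    n, amdahlSpeedup(0.95, n), uslCapacity(n, 0.05, 0.001));
        }
    }
}
```

With 95% parallelizable work, Amdahl's speedup can never reach 20 no matter how many workers are added. With a non-zero beta, the USL curve peaks and then declines, so adding instances beyond that point reduces throughput.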

However, building a scalable distributed system without the ability to stay responsive regardless of failures is a challenge. Let's think about a situation in which one part of our system is unavailable. Here, an external payment service goes down, and all user attempts to pay for the goods will fail. This is something that breaks the responsiveness of the system, which may be unacceptable in some cases. For example, if users cannot proceed with their purchases easily, they will probably go to a competitor's web store. To deliver a high-quality user experience, we must care about the system's responsiveness. The acceptance criterion for the system is the ability to stay responsive under failures, or, in other words, to be resilient. This may be achieved by applying isolation between functional components of the system, thereby isolating all internal failures and enabling independence.

Let's switch back to the Amazon web store. Amazon has many different functional components such as the order list, payment service, advertising service, comment service, and many others. For example, in the case of a payment service outage, we may accept user orders and then schedule a request auto-retry, thereby protecting the user from undesired failures. Another example might be isolation from the comments service. If the comments service goes down, the purchasing and orders list services should not be affected and should work without any problems.
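The idea of accepting an order during a payment outage and retrying later can be sketched as follows. This is a minimal, single-threaded illustration: `OrderIntake`, its status strings, and the `Predicate` standing in for the payment service are all hypothetical names, and a production system would use a durable queue and a scheduler instead of an in-memory one.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Predicate;

// Isolates order intake from payment failures by queuing unpaid orders for retry.
class OrderIntake {
    // Hypothetical payment call: returns true when charged, false or throws when unavailable.
    private final Predicate<String> paymentService;
    private final Queue<String> pendingPayments = new ArrayDeque<>();

    OrderIntake(Predicate<String> paymentService) {
        this.paymentService = paymentService;
    }

    // The order is always accepted; a payment failure never reaches the user as an error.
    String placeOrder(String orderId) {
        if (tryCharge(orderId)) {
            return "PAID";
        }
        pendingPayments.add(orderId);
        return "ACCEPTED_PAYMENT_PENDING";
    }

    // Intended to be called periodically; returns how many queued payments succeeded.
    int retryPending() {
        int succeeded = 0;
        int attempts = pendingPayments.size();
        for (int i = 0; i < attempts; i++) {
            String orderId = pendingPayments.poll();
            if (tryCharge(orderId)) {
                succeeded++;
            } else {
                pendingPayments.add(orderId); // still failing, keep it for the next round
            }
        }
        return succeeded;
    }

    int pendingCount() {
        return pendingPayments.size();
    }

    private boolean tryCharge(String orderId) {
        try {
            return paymentService.test(orderId);
        } catch (RuntimeException e) {
            return false; // treat any failure as "service unavailable"
        }
    }
}
```

The design choice here is that the failure stays inside the payment boundary: the caller of `placeOrder` only ever sees an accepted order, while the retry loop deals with the outage separately.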

Another point to emphasize is that elasticity and resilience are tightly coupled, and we achieve a truly responsive system only by enabling both. With scalability, we can have multiple replicas of the component so that, if one fails, we can detect this, minimize its impact on the rest of the system, and switch to another replica.
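Switching to another replica when one fails can be illustrated with a short sketch. The `ReplicaFailover` class is a hypothetical helper, and each `Supplier` stands in for a call to one replica; real systems usually delegate this to a load balancer or a client-side discovery library.

```java
import java.util.List;
import java.util.function.Supplier;

// Tries replicas in order and returns the first successful answer.
class ReplicaFailover {

    static <T> T callFirstAvailable(List<Supplier<T>> replicas) {
        RuntimeException lastFailure = null;
        for (Supplier<T> replica : replicas) {
            try {
                return replica.get();
            } catch (RuntimeException e) {
                lastFailure = e; // failure detected: move on to the next replica
            }
        }
        throw new IllegalStateException("all replicas failed", lastFailure);
    }
}
```

The caller is affected only when every replica is down, which is the sense in which scalability (having several replicas) supports resilience.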

Message-driven communication

The only question that is left unclear is how to connect components in the distributed system and preserve decoupling, isolation, and scalability at the same time. Let's consider communication between components over HTTP. The next code example, doing HTTP communication in Spring Framework 4, represents this concept:

@RequestMapping("/resource")                                       // (1)
public Object processRequest() {
    RestTemplate template = new RestTemplate();                    // (2)

    ExamplesCollection result = template.getForObject(             // (3)
        "http://example.com/api/resource2",                        //
        ExamplesCollection.class                                   //
    );                                                             //

    ...                                                            // (4)

    processResultFurther(result);                                  // (5)
}

The previous code is explained as follows:

  1. This is the request handler mapping declaration, which uses the @RequestMapping annotation.
  2. The code declared in this block shows how we may create the RestTemplate instance. RestTemplate is the most popular web client for doing request-response communication between services in Spring Framework 4.
  3. This demonstrates the request's construction and execution. Here, using the RestTemplate API, we construct an HTTP request and execute it right after that. Note that the response will be automatically mapped to the Java object and returned as the result of the execution. The type of response body is defined by the second parameter of the getForObject method. Furthermore, the get prefix in the method name means that the HTTP method, in this case, is GET.
  4. These are the additional actions that are skipped in the previous example.
  5. This is the execution of another processing stage.

In the preceding example, we defined the request handler, which will be invoked on users' requests. In turn, each invocation of the handler produces an additional HTTP call to an external service and then executes another processing stage. Although the preceding code may look familiar and transparent in terms of logic, it has some flaws. To understand what is wrong in this example, let's look at the following request timeline:

Diagram 1.1. Components interaction timeline

This diagram depicts the actual behavior of the corresponding code. As we may notice, only a small part of the processing time is spent on effective CPU usage, whereas for the rest of the time the thread is blocked by I/O and cannot be used for handling other requests.

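In some languages, such as C#, Go, and Kotlin, the same code might be non-blocking thanks to lightweight concurrency features (async/await, goroutines, and coroutines, respectively, often loosely called green threads). However, pure Java did not offer such features at the time of writing. Consequently, the actual thread will be blocked in such cases.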

On the other hand, in the Java world, we have thread pools, which may allocate additional threads to increase parallel processing. However, under a high load, dedicating a thread to each new I/O task may be extremely inefficient. We will revisit this problem later in this chapter and also analyze it thoroughly in Chapter 6, WebFlux Async Non-Blocking Communication.

Nonetheless, we can agree that to achieve better resource utilization in I/O cases, we should use an asynchronous and non-blocking interaction model. In real life, this kind of communication is messaging. When we get a message (SMS or email), our time is spent only on reading and responding. Moreover, we do not usually wait for the answer; we work on other tasks in the meantime. Unmistakably, in that case, work is optimized and the rest of the time may be utilized efficiently. Take a look at the following diagram:

To learn more about terminology, see the following links:
Diagram 1.2. Non-blocking message communication
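The asynchronous, non-blocking interaction described above can be sketched in plain Java. This is an illustration under stated assumptions, not the approach the book develops later: it uses the JDK's java.net.http.HttpClient, which requires Java 11 or newer, and the `NonBlockingHandler` class, the string-based `processResultFurther`, and the pluggable `transport` parameter are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// The handler registers a continuation instead of waiting for the response.
class NonBlockingHandler {

    // Real transport: sendAsync returns immediately with a future.
    // A real application would create the client once and reuse it.
    static CompletableFuture<String> fetch(String url) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return HttpClient.newHttpClient()
                .sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body);
    }

    // The transport is a parameter so that it can be replaced (for example, in tests).
    static CompletableFuture<String> processRequest(
            Function<String, CompletableFuture<String>> transport) {
        return transport.apply("http://example.com/api/resource2")
                .thenApply(NonBlockingHandler::processResultFurther);
    }

    // Hypothetical next processing stage.
    static String processResultFurther(String body) {
        return "processed:" + body.length();
    }
}
```

Calling `NonBlockingHandler.processRequest(NonBlockingHandler::fetch)` returns a future right away; the calling thread is free to handle other requests, and `processResultFurther` runs only once the response arrives.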

In general, to achieve efficient resource utilization when communicating between services in a distributed system, we have to embrace the message-driven communication principle. The overall interaction between services may be described as follows: each element awaits the arrival of messages and reacts to them, otherwise lying dormant, and, conversely, each component should be able to send a message in a non-blocking fashion. Moreover, such an approach to communication improves system scalability by enabling location transparency. When we send an email to a recipient, we care only about the correctness of the destination address. The mail server then takes care of delivering that email to one of the recipient's available devices. This frees us from concerns about any particular device, allowing recipients to use as many devices as they want. Furthermore, it improves failure tolerance, since the failure of one of the devices does not prevent recipients from reading an email on another device.

One of the ways to achieve message-driven communication is by employing a message broker. In that case, by monitoring the message queue, the system is able to manage load and elasticity. Moreover, message-based communication gives clear flow control and simplifies the overall design. We will not get into specific details of this in this chapter, as we will cover the most popular techniques for achieving message-driven communication in Chapter 8, Scaling Up with Cloud Streams.

The phrase lying dormant was taken from the following original document, which aims to emphasize message-driven communication: https://www.reactivemanifesto.org/glossary#Message-Driven.
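The message-driven principle can be illustrated inside a single JVM with a bounded queue. This is only a sketch under simplifying assumptions: the `Mailbox` class is hypothetical, an in-memory queue stands in for a real message broker, and the worker is a plain thread that parks while the queue is empty.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// A component that reacts to messages and otherwise lies dormant.
class Mailbox {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>(1_000); // bounded
    private final List<String> handled = new CopyOnWriteArrayList<>();
    private final Thread worker;

    Mailbox() {
        worker = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.take(); // dormant until a message arrives
                    handled.add("handled:" + message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop on shutdown
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Non-blocking send: returns false instead of waiting when the queue is full.
    boolean send(String message) {
        return queue.offer(message);
    }

    // Queue depth; a monitor could use it to decide when to add or remove consumers.
    int backlog() {
        return queue.size();
    }

    List<String> handled() {
        return handled;
    }

    void shutdown() {
        worker.interrupt();
    }
}
```

The sender never waits for the receiver, and the `backlog()` value is the kind of signal that load management and elasticity decisions can be based on.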

By embracing all of the previous statements, we will get the foundational principles of the reactive system. This is depicted in the following diagram:

Diagram 1.3. Reactive Manifesto

As we may notice from the diagram, the primary value for any business implemented with a distributed system is responsiveness. Achieving a responsive system means following fundamental techniques such as elasticity and resilience. Finally, one of the fundamental ways to attain a responsive, elastic, and resilient system is by employing message-driven communication. In addition, systems built following such principles are highly maintainable and extensible, since all components in the system are independent and properly isolated.

We will not go through all the notions defined in the Reactive Manifesto in depth, but it is highly recommended to revisit the glossary provided at the following link: https://www.reactivemanifesto.org/glossary.

None of these notions is new; they have all been defined in the Reactive Manifesto, the document that describes the concepts of a reactive system. This manifesto was created to ensure that businesses and developers have the same understanding of conventional notions. To emphasize, a reactive system and the Reactive Manifesto are concerned with architecture, and this may be applied to either large distributed applications or small one-node applications.

The importance of the Reactive Manifesto (https://www.reactivemanifesto.org) is explained by Jonas Bonér, the Founder and CTO of Lightbend, at the following link: https://www.lightbend.com/blog/why_do_we_need_a_reactive_manifesto%3F.