In the Challenges with autonomous software components section, we saw some of the challenges that autonomous software components can bring. They all apply to microservices as well:
- Many small components that use synchronous communication can cause a chain-of-failure problem, especially under high load.
- Keeping the configuration up to date for many small components can be challenging.
- It's hard to track a request that is processed by many components, for example, when performing root cause analysis, since each component stores its log events locally.
- Analyzing the usage of hardware resources on a component level can be challenging as well.
- Manual configuration and management of many small components can become costly and error-prone.
Another downside of decomposing an application into a group of autonomous components, though not always obvious at first, is that they form a distributed system. Distributed systems are, by their nature, very hard to deal with. This has been known for many years (but is in many cases neglected until proven otherwise). My favorite quote to establish this fact is from Peter Deutsch who, back in 1994, stated the following:
The 8 fallacies of distributed computing: Essentially everyone, when they first build a distributed application, makes the following eight assumptions. All prove to be false in the long run and all cause big trouble and painful learning experiences:
- The network is reliable
- Latency is zero
- Bandwidth is infinite
- The network is secure
- Topology doesn't change
- There is one administrator
- Transport cost is zero
- The network is homogeneous
Note: The eighth fallacy was actually added by James Gosling at a later date. For more details, please go to https://www.rgoarchitects.com/Files/fallacies.pdf.
In general, building microservices based on these false assumptions leads to solutions that are vulnerable to both temporary network glitches and problems that occur in other microservice instances. As the number of microservices in a system landscape increases, the likelihood of problems also goes up. A good rule of thumb is to design your microservice architecture on the assumption that something is always going wrong somewhere in the system landscape. The architecture needs to handle this, both by detecting problems and restarting failed components and, on the client side, by not sending requests to failed microservice instances. When problems are corrected, requests to the previously failing microservice should be resumed; that is, microservice clients need to be resilient. All of this needs, of course, to be fully automated. With a large number of microservices, it is not feasible for operators to handle this manually!
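To make the client-side behavior concrete, here is a minimal sketch of one common way to get it, a circuit breaker. This is an illustration under stated assumptions, not a production implementation: the names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `reset_timeout` are hypothetical and chosen for this example, and the sketch is single-threaded. After a number of consecutive failures, the client stops sending requests to the failing service; once a timeout has passed, it lets a trial request through and resumes normal traffic if that request succeeds.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical, single-threaded circuit breaker for illustration only."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial request
        self.clock = clock                          # injectable so tests can control time
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of sending a request to a failing instance.
                raise CircuitOpenError("circuit is open; request not sent")
            # Timeout elapsed: fall through and let one trial request probe the service.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                # Open the circuit, or re-open it if the trial request failed.
                self.opened_at = self.clock()
            raise
        # Success: close the circuit and resume normal traffic.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_product, product_id)`, where `fetch_product` stands in for whatever function performs the request. In practice, an established resilience library would normally be preferable to hand-rolled code like this.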
The scope of this is large, but we will limit ourselves for now and move on to study design patterns for microservices.