Microservice design aspects


While designing microservices, various important decisions need to be taken, such as how the microservices will communicate with each other, how we will handle security, how we will do data management, and so on. Let's now look at the various aspects involved in microservice design and understand the options available for each.

Communication between microservices

Let's understand this aspect and the problem it addresses with a real-world example. In the shopping cart application, we have our product microservice, inventory microservice, checkout microservice, and user microservice. Now a user opts to buy a product: the product should be added to their cart, the amount paid and, on successful payment, the checkout completed and the inventory updated. The checkout and inventory should be updated only if the payment succeeds, hence the services need to communicate with each other. Let's now look at some of the mechanisms that microservices can use to communicate with each other or with any external clients.

Remote Procedure Invocation (RPI)

Briefly speaking, a remote procedure call is a protocol that anyone can use to access services from any other provider located remotely in the network, without needing to understand the network details. The client uses a request-and-reply protocol to make requests for services, and it is one of the most feasible alternatives to REST for big data search systems. One of its major advantages is low serialization overhead. Some of the technologies providing RPI are Apache Thrift and Google's gRPC. gRPC is a widely adopted library with more than 23,000 npm downloads per day. It has some awesome utilities such as pluggable authentication, tracing, load balancing, and health checking. It is used by Netflix, CoreOS, Cisco, and so on. This pattern of communication has the following advantages:

  • Request and reply are easy
  • Simple to maintain as there is no middle broker
  • Bidirectional streams with HTTP/2-based transport
  • Efficiently connects polyglot services in microservice-style architectural ecosystems

This pattern has the following challenges and issues for consideration:

  • The caller needs to know the locations of service instances, that is, a client-side or server-side service registry must be maintained
  • It only supports the request-and-reply model and has no support for other patterns such as notifications, async responses, the publish/subscribe pattern, streams, and so on

RPI uses binary rather than text to keep the payload compact and efficient. Requests are multiplexed over a single TCP connection, which allows multiple concurrent messages to be in flight without compromising network usage.
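
To make this concrete, the following is a minimal gRPC server sketch in Node.js/TypeScript using the @grpc/grpc-js and @grpc/proto-loader packages. The inventory.proto file and its CheckStock RPC are hypothetical, for illustration only:

```typescript
// Minimal gRPC server sketch. inventory.proto and the CheckStock RPC
// are hypothetical examples, not from the book.
//
// inventory.proto (hypothetical):
//   syntax = "proto3";
//   package inventory;
//   service Inventory { rpc CheckStock (StockRequest) returns (StockReply); }
//   message StockRequest { string productId = 1; }
//   message StockReply  { string productId = 1; bool inStock = 2; }
import * as grpc from '@grpc/grpc-js';
import * as protoLoader from '@grpc/proto-loader';

// Load the service definition at runtime
const packageDef = protoLoader.loadSync('inventory.proto', { keepCase: true, defaults: true });
const proto = grpc.loadPackageDefinition(packageDef) as any;

const server = new grpc.Server();
server.addService(proto.inventory.Inventory.service, {
  // Unary handler for the hypothetical CheckStock RPC
  CheckStock: (call: any, callback: grpc.sendUnaryData<any>) => {
    callback(null, { productId: call.request.productId, inStock: true });
  },
});

// All calls are multiplexed over a single HTTP/2 connection per client;
// on recent @grpc/grpc-js versions the server starts once bound.
server.bindAsync('0.0.0.0:50051', grpc.ServerCredentials.createInsecure(), (err) => {
  if (err) throw err;
});
```

A client would load the same inventory.proto and call CheckStock as if it were a local function, with protobuf handling the serialization.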

Messaging and message bus

This mode of communication is used when services have to handle requests from various client interfaces. Services need to collaborate with each other to handle specific operations, for which they need an inter-process communication protocol; asynchronous messaging over a message bus is one such protocol. Microservices communicate with each other by exchanging messages over various messaging channels. Apache Kafka, RabbitMQ, ActiveMQ, and Kestrel are some of the widely available message brokers that can be used for communication between microservices.

The message broker ultimately provides the following set of functionalities:

  • Routes messages coming from various clients to the different microservice destinations.
  • Transforms messages as needed.
  • Aggregates messages, segregates a message into multiple messages, sends them to their destinations as needed, and recomposes them.
  • Responds to errors or events.
  • Provides content-based routing using the publish-subscribe pattern.

Using a message bus as a means of communication between microservices has the following advantages:

  • The client is decoupled from the services; it doesn't need to discover any services, giving a loosely coupled architecture throughout.
  • High availability, as the message broker persists messages until the consumer is able to process them.
  • Support for a variety of communication patterns, including the widely used request/reply, notifications, async responses, publish-subscribe, and so on.

While this mode provides several advantages, it adds the complexity of a message broker, which must be made highly available, as it can otherwise become a single point of failure. It also implies that the client needs to discover the location of the message broker, its single point of contact.
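
As an illustration, here is a minimal sketch of broker-based messaging over RabbitMQ using the amqplib package; the checkout queue name and the message payload are hypothetical:

```typescript
// Minimal RabbitMQ publish/consume sketch using amqplib.
// The 'checkout' queue and the message payload are illustrative only.
import * as amqp from 'amqplib';

async function main() {
  const connection = await amqp.connect('amqp://localhost');
  const channel = await connection.createChannel();
  await channel.assertQueue('checkout', { durable: true });

  // Producer side: the checkout service publishes an event
  channel.sendToQueue(
    'checkout',
    Buffer.from(JSON.stringify({ orderId: 'o-1', status: 'PAID' })),
    { persistent: true } // the broker persists the message until consumed
  );

  // Consumer side: the inventory service processes and acknowledges it
  await channel.consume('checkout', (msg) => {
    if (msg) {
      console.log('inventory received:', msg.content.toString());
      channel.ack(msg);
    }
  });
}

main().catch(console.error);
```

The persistent flag provides the high availability noted above: the broker holds the message until the consumer acknowledges it.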

Protobufs

Protocol buffers, or protobufs, are a binary format created by Google. Google defines protobufs as a language-neutral, platform-neutral, extensible way of serializing structured data that can be used as a communication protocol. Protobufs also define a set of language rules that define the structure of messages. Some demonstrations show protobufs to be six times faster than JSON. It is very easy to implement and involves three major stages: creating message descriptors, message implementations, and parsing and serialization. Using protobufs in your microservices gives you the following advantages:

  • Protobuf formats are self-describing, formal formats.
  • It has RPC support; you can declare server RPC interfaces as part of the protocol files.
  • It has an option for structure validation: messages serialized with protobufs can be validated automatically by the code that is responsible for exchanging them.

While the protobuf pattern offers various advantages, it has some drawbacks, which are as follows:

  • It is an upcoming pattern, hence you won't find many resources or detailed documentation for implementing protobufs. If you look for the protobuf tag on Stack Overflow, you will see a mere 10,000 questions.
  • As it is a binary format, it is not human-readable, whereas JSON is simple to read and analyze. The next generation of protobufs, FlatBuffers, is already available.
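
The three stages mentioned above can be sketched with the protobufjs package as follows; the product.proto descriptor and its Product message are hypothetical:

```typescript
// Sketch of the three protobuf stages using the protobufjs package.
// product.proto and the Product message are hypothetical.
//
// product.proto (hypothetical):
//   syntax = "proto3";
//   message Product { string id = 1; string name = 2; int32 stock = 3; }
import * as protobuf from 'protobufjs';

async function main() {
  // Stage 1: load the message descriptor
  const root = await protobuf.load('product.proto');
  const Product = root.lookupType('Product');

  // Stage 2: message implementation, with automatic structure validation
  const payload = { id: 'p-1', name: 'Widget', stock: 42 };
  const invalid = Product.verify(payload);
  if (invalid) throw new Error(invalid);

  // Stage 3: serialization to a compact binary buffer, then parsing back
  const buffer = Product.encode(Product.create(payload)).finish();
  const decoded = Product.decode(buffer);
  console.log(decoded.toJSON());
}

main().catch(console.error);
```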

Service discovery

The next obvious aspect to take care of is the method through which any client interface or any microservice will discover the network location of a service instance. Modern applications based on microservices run in virtualized or containerized environments where things change dynamically, including the number of service instances and their locations. Also, the set of service instances changes dynamically based on auto-scaling, upgrades, and so on. We therefore need an elaborate service discovery mechanism. The widely used patterns are discussed ahead.

Service registry for service-service communication

Different microservices and various client interfaces need to know the locations of service instances so as to send them requests. Usually, virtual machines or containers have a different or dynamic IP address; for example, an EC2 group with auto-scaling applied automatically adjusts the number of instances based on load. Various options are available for maintaining a registry, such as client-side or server-side registration. Clients or microservices look up that registry to find other microservices to communicate with.

Let's take the real-life example of Netflix. Netflix Eureka is a service registry provider. It has various options for registering and querying available service instances. Using the exposed POST API, a service instance announces its network location. It must then refresh its registration every 30 seconds using the exposed PUT API. Any interface can use the GET API to get that instance and use it on demand. Some of the widely available options are as follows:

  • etcd: A key-value store used for shared configuration and service discovery. Projects such as Kubernetes and Cloud Foundry are based on etcd as it is highly available, key-value based, and consistent.
  • Consul: Yet another tool for service discovery. It has wide-ranging options, such as exposed API endpoints that allow clients to register and discover services, and it performs health checks to determine service availability.
  • ZooKeeper: A very widely used, highly available, high-performance coordination service for distributed applications. Originally a subproject of Hadoop, ZooKeeper is now a widely used top-level project and comes preconfigured with various frameworks.

Some systems have an implicit service registry built in as a part of their framework, for example, Kubernetes, Marathon, and AWS ELB.
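
As a concrete sketch, a service could register itself with a local Consul agent (mentioned above) through Consul's HTTP API; the service name, address, and health-check URL below are hypothetical, and global fetch assumes Node 18+:

```typescript
// Sketch of registering a service with a local Consul agent via its
// HTTP API (PUT /v1/agent/service/register). The service name, address,
// and health-check URL are hypothetical; global fetch needs Node 18+.
async function registerWithConsul(): Promise<void> {
  const res = await fetch('http://localhost:8500/v1/agent/service/register', {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      ID: 'products-1',
      Name: 'products',
      Address: '10.0.0.12',
      Port: 3000,
      // Consul polls this endpoint to determine service availability
      Check: { HTTP: 'http://10.0.0.12:3000/health', Interval: '10s' },
    }),
  });
  if (!res.ok) throw new Error(`registration failed: ${res.status}`);
}

registerWithConsul().catch(console.error);
```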

Server-side discovery

All requests made to any of the services are routed via a router or load balancer that runs at a location known to the client interfaces. The router then queries a maintained registry and forwards the request based on the query response. An AWS Elastic Load Balancer is a classic example: it can load balance internal or external traffic and act as a service registry. EC2 instances are registered with the ELB either via exposed API calls or through auto-scaling. Other options include NGINX and NGINX Plus. Consul Template, for example, can generate the nginx.conf file from the Consul service registry and configure proxying as required.

Some of the major advantages of using server-side discovery are as follows:

  • The client does not need to know the locations of the different microservices. It just needs to know the location of the router; the service discovery logic is completely abstracted from the client, so there is zero discovery logic at the client end.
  • Some environments provide this component functionality for free.

While these options have great advantages, there are some drawbacks too that need to be handled:

  • It involves more network hops, that is, one from the client to the service registry and another from the service registry to the microservice.
  • If the load balancer is not provided by the environment, then it has to be set up and managed. If not properly handled, it can become a single point of failure.
  • The selected router or load balancer must support the different communication protocols used for the various modes of communication.

Client-side discovery

Under this mode of discovery, the client is responsible for handling the network locations of the available microservices and load balancing incoming requests across them. The client queries a service registry (a database of available services maintained on the client side), selects a service instance on the basis of an algorithm, and then makes its request. Netflix uses this pattern extensively and has open sourced its tools as Netflix OSS: Netflix Eureka, Netflix Ribbon, and Netflix Prana. Using this pattern has the following advantages:

  • High performance and availability, as there are fewer transition hops; the client just has to look up the registry and then call the selected microservice directly.
  • This pattern is fairly simple and highly resilient, as besides the service registry there are no moving parts. As the client knows about the available microservices, it can easily make intelligent decisions, such as when to use a hash or when to trigger auto-scaling.

One significant drawback of this mode of service discovery is that client-side service discovery logic has to be implemented in every programming language and framework used by the service clients, for example, Java, JavaScript, Scala, Node.js, Ruby, and so on.
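
To illustrate the pattern itself, here is a minimal sketch of the lookup-and-select step, assuming a hypothetical registry endpoint that returns a JSON list of instances (round-robin stands in for the selection algorithm; global fetch needs Node 18+):

```typescript
// Client-side discovery sketch: fetch the instance list for a service
// from a registry endpoint and round-robin across it. The registry URL
// and response shape are hypothetical; global fetch needs Node 18+.
interface Instance { host: string; port: number }

let counter = 0;

async function resolve(service: string): Promise<Instance> {
  const res = await fetch(`http://registry.local/services/${service}`);
  const instances: Instance[] = await res.json();
  if (instances.length === 0) throw new Error(`no instances of ${service}`);
  // Round-robin stands in for the selection algorithm
  return instances[counter++ % instances.length];
}

async function callProducts(): Promise<void> {
  const { host, port } = await resolve('products');
  const res = await fetch(`http://${host}:${port}/products`);
  console.log(await res.json());
}

callProducts().catch(console.error);
```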

Registration patterns – self-registration

While using this pattern, each microservice instance is responsible for registering and deregistering itself with the maintained service registry. To support health checks, a service instance sends heartbeat requests to prevent its registration from expiring. Netflix uses a similar approach and has open sourced its Eureka library, which handles all aspects of service registration and deregistration. It has clients in Java as well as Node.js; the Node.js client (eureka-js-client) has more than 12,000 downloads a month. The self-registration pattern has major benefits, such as that any microservice instance knows its own state, so it can implement and switch between modes easily, such as Starting, Available, and others.

However, it also has the following drawbacks:

  • It couples the service tightly to the self-service registry, which forces us to implement the service registration code in each language and framework we use
  • Any microservice that is in running mode but is not able to handle requests will often be unaware of which state to report, and will often end up forgetting to unregister itself from the registry
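
Despite these drawbacks, self-registration is straightforward to wire up. Here is a minimal sketch using the eureka-js-client package mentioned above (typings via @types/eureka-js-client); the app name, hosts, and ports are illustrative:

```typescript
// Self-registration sketch with eureka-js-client (typings available
// via @types/eureka-js-client). App name, hosts, and ports are
// illustrative only.
import { Eureka } from 'eureka-js-client';

const client = new Eureka({
  instance: {
    app: 'products',
    hostName: 'products.local',
    ipAddr: '10.0.0.12',
    port: { $: 3000, '@enabled': true },
    vipAddress: 'products',
    dataCenterInfo: {
      '@class': 'com.netflix.appinfo.InstanceInfo$DefaultDataCenterInfo',
      name: 'MyOwn',
    },
  },
  eureka: { host: 'eureka.local', port: 8761, servicePath: '/eureka/apps/' },
});

// Registers the instance and keeps sending heartbeats so the
// registration does not expire; call client.stop() to deregister.
client.start();
```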

Data management

Another important question in microservice design is the database architecture of a microservices application. We will look at various options, such as whether to maintain a private datastore per service, how to manage transactions, and how to make querying datastores easy in distributed systems. An initial thought might be to go with a single database, but on deeper consideration we soon see that this is an unwise and unfitting solution, because of tight coupling, differing requirements, and runtime blocking by any of the services.

Database per service

In a distributed microservices architecture, different services have different storage requirements and usage patterns. A relational database is a perfect choice when it comes to maintaining relations and running complex queries. NoSQL databases such as MongoDB are the best choice when there is unstructured, complex data. Some services may require graph data and thus use a graph database such as Neo4j. The solution is to keep each microservice's data private to that service and make it accessible only via APIs. Each microservice maintains its own datastore, which is a private part of that service's implementation and hence is not directly accessible by other services.

Some of the options you have while implementing this mode of data management are as follows:

  • Private tables/collections per service: Each microservice has a set of defined tables or collections that can only be accessed by that service
  • Schema per service: Each service has a schema that can only be accessed via the microservice it is bound to
  • Database per service: Each microservice maintains its own database as per its needs and requirements

On reflection, maintaining a schema per service seems to be the most logical solution, as it has lower overhead and ownership can clearly be made visible. If some services have high usage and throughput, then maintaining a separate database is the logical option. A necessary step is to add barriers that restrict any microservice from accessing data directly. Various options for adding this barrier include assigning user IDs with restricted privileges or using access control mechanisms such as grants. This pattern has the following advantages:

  • Loosely coupled services that can stand on their own; changes to one service's datastore won't affect any other services.
  • Each service has the liberty to select the datastore as required. Each microservice has the option of whether to go for relational or non-relational databases as per need. For example, any service that needs intensive search results on text may go for Solr or Elasticsearch, whereas any service where there is structured data may go for any SQL database.

This pattern has the following drawbacks and shortcomings that need to be handled with care:

  • Handling complex scenarios that involve transactions spanning multiple services. The CAP theorem states that it is impossible for a distributed datastore to provide more than two of the following three guarantees—consistency, availability, and partition tolerance—so transactions are generally avoided.
  • Queries ranging across multiple databases are challenging and resource consuming.
  • The complexity of managing multiple SQL and non-SQL datastores.

To overcome the drawbacks, the following patterns are used while maintaining a database per service:

  • Sagas: A saga is defined as a batch sequence of local transactions. Each entry in the batch updates the specified database and moves on by publishing a message or triggering an event for the next entry in the batch to happen. If any entry in the batch fails locally or any business rule is violated, then the saga executes a series of compensating transactions that compensate for or undo the changes made by the saga's batch updates (a compensating-transaction sketch follows this list).
  • API Composition: This pattern insists that the application should perform the join rather than the database. As an example, a service is dedicated to query composition. So, if we want to fetch monthly product distributions, then we first retrieve the products from the product service and then query the distribution service to return the distribution information of the retrieved products.
  • Command Query Responsibility Segregation (CQRS): The principle of this pattern is to have one or more evolving views, which usually hold data coming from various services. Fundamentally, it splits the application into two parts—the command or operating side and the query or executor side. It is more of a publisher-subscriber pattern, where the command side handles create/update/delete requests and emits events whenever the data changes. The executor side listens for those events and handles queries by maintaining views that are kept up to date, based on subscriptions to the events emitted by the command side.
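
Here is a highly simplified saga coordinator sketch; the step names and stubbed service calls are hypothetical, and a real saga would publish events on a message broker rather than call functions directly:

```typescript
// Simplified in-process saga coordinator; real sagas are event-driven.
type SagaStep = {
  name: string;
  action: () => Promise<void>;     // the local transaction
  compensate: () => Promise<void>; // undoes that transaction
};

async function runSaga(steps: SagaStep[]): Promise<void> {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.action();
      completed.push(step);
    } catch (err) {
      // A step failed: undo the completed steps in reverse order
      for (const done of completed.reverse()) {
        await done.compensate();
      }
      throw new Error(`saga failed at step "${step.name}": ${err}`);
    }
  }
}

// Hypothetical stubs standing in for calls to other services
async function chargeCard() { /* call payment service */ }
async function refundCard() { /* compensating transaction */ }
async function reserveStock() { /* call inventory service */ }
async function releaseStock() { /* compensating transaction */ }

runSaga([
  { name: 'payment', action: chargeCard, compensate: refundCard },
  { name: 'inventory', action: reserveStock, compensate: releaseStock },
]).catch(console.error);
```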

Sharing concerns

The next big thing to handle in a distributed microservice architecture is shared concerns: how will general things such as API routing, security, logging, and configuration work? Let's look at these points one by one.

Externalized configuration

An application usually uses one or more infrastructure and third-party services, such as a service registry, a message broker, a server, a cloud deployment platform, and so on. Any service must be able to run in multiple environments without modification, so it should be able to pick up external configuration. This pattern is more of a guideline that advises us to externalize all configuration, including database information, environment info, network locations, and so on, and to create a startup service that reads this information and prepares the application accordingly. Various options are available: Node.js supports setting environment variables, and if you use Docker, configuration can live in the docker-compose.yml file.
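
A minimal sketch of this guideline in Node.js/TypeScript, reading everything environment-specific from process.env at startup (the variable names and defaults are illustrative):

```typescript
// Externalized configuration sketch: everything environment-specific
// comes from process.env, with local-development defaults. The
// variable names and defaults are illustrative.
interface AppConfig {
  port: number;
  mongoUrl: string;
  brokerUrl: string;
}

export function loadConfig(): AppConfig {
  return {
    port: Number(process.env.PORT ?? 3000),
    mongoUrl: process.env.MONGO_URL ?? 'mongodb://localhost:27017/app',
    brokerUrl: process.env.BROKER_URL ?? 'amqp://localhost',
  };
}
```

The same build can then move between environments unchanged, with only the environment variables (or the docker-compose.yml environment section) varying.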

Observability

Revisiting the twelve factors required for an application, we observe that any application needs some centralized features, even if it is distributed. These centralized features enable proper monitoring and debugging in case of issues. Let's look at some of the common observability parameters to look out for.

Log aggregation

Each service instance generates information about what it is doing in a standardized format, containing logs at various levels such as error, warning, info, debug, trace, fatal, and so on. The solution is to use a centralized logging service that collects logs from each service instance and stores them in a common place where users can search and analyze them. This enables us to configure alerts for certain kinds of logs. A centralized service also helps with audit logging, exception tracking, and API metrics. Available and widely used options include the Elastic Stack (Elasticsearch, Logstash, Kibana), AWS CloudTrail, and AWS CloudWatch.
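
To make logs easy for such a collector to aggregate, each service can emit structured JSON. Here is a minimal sketch using the winston package; the service name field is illustrative:

```typescript
// Structured-logging sketch with the winston package, so that a
// centralized collector (for example, Logstash) can parse each line.
// The service name field is illustrative.
import winston from 'winston';

const logger = winston.createLogger({
  level: 'info',
  // JSON output with timestamps is easy for log shippers to aggregate
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  defaultMeta: { service: 'products' },
  transports: [new winston.transports.Console()],
});

logger.info('product fetched', { productId: 'p-1' });
logger.error('inventory lookup failed', { productId: 'p-1' });
```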

Distributed tracing

The next big problem is understanding the behavior of the application so as to troubleshoot problems when required. This pattern is more of a design guideline: maintain a unique external request ID per request, handled by the microservice that receives it. This external request ID needs to be passed to all services that are involved in handling that request and included in all log messages. Another guideline is to record the start time and end time of requests and of the operations performed when a microservice does its work.
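
A minimal sketch of both guidelines as Express middleware; the x-request-id header is a common convention rather than a requirement from the book, and randomUUID needs Node 14.17+:

```typescript
// Express middleware sketch: assign or propagate a unique external
// request ID and log start time and duration for each request.
import express from 'express';
import { randomUUID } from 'crypto';

const app = express();

app.use((req, res, next) => {
  // Reuse the upstream ID if present so the trace spans services
  const requestId = (req.headers['x-request-id'] as string) ?? randomUUID();
  res.setHeader('x-request-id', requestId);
  const start = Date.now();
  res.on('finish', () => {
    // Every log line carries the request ID, start time, and duration
    console.log(
      `[${requestId}] ${req.method} ${req.url} started=${start} took=${Date.now() - start}ms`
    );
  });
  next();
});

app.listen(3000);
```

When this service calls another microservice, it should forward the same x-request-id header so that log aggregation can stitch the whole request together.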

Based on the preceding design aspects, we will now look at common microservice design patterns and understand each pattern in depth: when to use a particular pattern, what problems it solves, and what pitfalls to avoid while using it.