Rate limiting restricts the number of requests a client can make in a given period. A client can be identified by the access token it sends with the request, as covered in Chapter 3, Security and Traceability. Alternatively, a client can be identified by its IP address.
To prevent abuse of the server, APIs must enforce throttling or rate-limiting techniques. Once the client has been identified, the rate-limiting application decides whether to allow the request to go through.
The server decides what an appropriate rate limit per client should be, for example, 500 requests per hour. When a client makes a request to the server via an API call, the server checks whether the client's request count is within the limit. If it is, the request goes through and the client's count is incremented. If the count has reached the limit, the server can reject the request with a 429 (Too Many Requests) status code.
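The check described above can be sketched in a few lines of Python. This is a minimal illustration using a fixed time window, not a definitive implementation: the class name FixedWindowRateLimiter, the check method, and the in-memory dictionary are all assumptions made for the example, and the client_id can be an access token or an IP address. A production service would typically keep the counters in a shared store and guard them against concurrent updates.

```python
import time


class FixedWindowRateLimiter:
    """Hypothetical helper: counts requests per client in fixed time windows."""

    def __init__(self, limit=500, window_seconds=3600, clock=time.time):
        self.limit = limit                    # e.g. 500 requests...
        self.window_seconds = window_seconds  # ...per hour
        self.clock = clock                    # injectable so the sketch is testable
        self._counters = {}                   # client_id -> (window_start, count)

    def check(self, client_id):
        """Return 200 if the request may proceed, 429 if the limit is reached."""
        now = self.clock()
        window_start, count = self._counters.get(client_id, (now, 0))

        # Start a fresh window once the previous one has expired.
        if now - window_start >= self.window_seconds:
            window_start, count = now, 0

        if count >= self.limit:
            # Limit reached: reject without increasing the count.
            self._counters[client_id] = (window_start, count)
            return 429

        # Within the limit: let the request through and increase the count.
        self._counters[client_id] = (window_start, count + 1)
        return 200


if __name__ == "__main__":
    limiter = FixedWindowRateLimiter(limit=2, window_seconds=3600)
    for _ in range(3):
        print(limiter.check("token-abc"))
```

With a limit of 2, the first two calls for the same client are allowed and the third is rejected until the window expires. Each client is counted separately, so a request from a different token or IP address is unaffected.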
The server can optionally include a Retry-After header, which indicates...