Now that we have all of the tools to communicate between disparate servers, we can start building a very rudimentary API to generate ID values that are distinct across a pool of database servers. By doing so, database-level function calls are available to the application and encourage data distribution, otherwise known as application-level sharding. This, in turn, increases our scalability and availability, as it will take far more than a single database outage to truly derail the application.
A company that did this early in the development cycle of their platform is Instagram. In fact, they're very open about the process they used, as described in this blog post:
The idea they implemented may seem complicated but is actually deceptively simple. Here's a basic breakdown of what they were trying to create:
- The system should accommodate several thousand logical shards