Behind the application we are likely to find a database cluster of some sort. For this example, we have chosen RDS (MySQL/PostgreSQL). However, the scaling and resilience ideas can be easily translated to suit a custom DB cluster on EC2 instances.
Starting with high availability: in RDS, the feature is called a Multi-AZ deployment. This gives us a Primary RDS instance with a hot Standby replica in a different Availability Zone as a failover target. Unfortunately, the Standby cannot be used for anything else; for example, it cannot serve read-only queries.
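As a rough sketch, enabling Multi-AZ comes down to a single flag when the instance is created. The snippet below only builds the request parameters. The helper name, identifiers, instance class, storage size, and username are placeholders assumed for illustration, while the parameter names follow boto3's `create_db_instance` call.

```python
def multi_az_instance_params(identifier, subnet_group, password):
    """Build request parameters for a Multi-AZ RDS instance.

    All concrete values here (engine, instance class, storage size,
    username) are illustrative assumptions, not recommendations.
    """
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "postgres",               # or "mysql"
        "DBInstanceClass": "db.m5.large",
        "AllocatedStorage": 100,            # GiB
        "MasterUsername": "appadmin",
        "MasterUserPassword": password,
        "DBSubnetGroupName": subnet_group,  # should span at least two AZs
        "MultiAZ": True,                    # asks RDS to provision the Standby
    }


params = multi_az_instance_params("app-db", "app-db-subnets", "change-me")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("rds").create_db_instance(**params)
```

Keeping the parameters in a plain dictionary makes it easy to review or test the configuration before anything is actually provisioned.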
A Multi-AZ setup within our VPC would look like this:
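If the Primary suffers an outage, RDS automatically fails over to the Standby and updates the DNS record of the instance endpoint so that it points at the promoted instance. According to the documentation, a typical failover takes one to two minutes, during which existing connections drop and clients must reconnect.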
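Because the endpoint name stays the same while its DNS record changes, an application can usually ride out a failover by retrying the connection for a couple of minutes. The following is a minimal sketch of that idea using only the standard library. The function name, the 180-second budget, and the use of `OSError` as the failure signal are assumptions; a real database driver raises its own exception types, which would need to be caught instead.

```python
import time


def connect_with_retry(connect, max_wait=180.0, first_delay=1.0,
                       max_delay=15.0, sleep=time.sleep, clock=time.monotonic):
    """Call connect() until it succeeds or max_wait seconds have elapsed.

    `connect` is any zero-argument callable that opens a fresh connection
    (and so resolves the endpoint name again) and raises OSError on failure.
    """
    deadline = clock() + max_wait
    delay = first_delay
    while True:
        try:
            return connect()
        except OSError:
            if clock() + delay > deadline:
                raise  # give up: the outage outlasted our retry budget
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

The retry budget is deliberately a little longer than the one to two minutes quoted above, and the capped backoff avoids hammering the endpoint while the failover is in progress.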
Failover triggers include the Primary becoming unavailable (and thus failing AWS health checks), a complete AZ outage, or a user-initiated action such as rebooting the RDS instance with failover requested.
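The user-initiated trigger is also a handy way to rehearse a failover before a real outage forces one. A small sketch, assuming a client object that behaves like `boto3.client("rds")`; the helper name and the instance identifier in the usage comment are hypothetical:

```python
def force_failover(rds_client, instance_id):
    """Reboot a Multi-AZ instance and ask RDS to fail over to the Standby.

    `rds_client` is expected to behave like boto3.client("rds").
    """
    return rds_client.reboot_db_instance(
        DBInstanceIdentifier=instance_id,
        ForceFailover=True,  # only accepted for Multi-AZ deployments
    )


# With boto3 installed and credentials configured, usage might look like:
#   import boto3
#   force_failover(boto3.client("rds"), "app-db")
```

Passing the client in as an argument keeps the helper easy to exercise against a stub, so the drill itself can be tested without touching a live database.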