Once at Yipit, a read-intensive Redis instance was experiencing slowdowns that nobody could explain. At first, the DevOps team suspected that the application's code was making Redis slow, but after some investigation, they found that the cause was a periodic backup strategy. Chapter 8, Scaling Redis (Beyond a Single Instance), will cover persistence in depth.
When Redis starts creating an RDB snapshot or rewriting the AOF file, it creates a child process (using the fork() system call), and that child process does the work.
While fork() is executing, the main Redis process is blocked and stops serving clients. This is when the latency perceived by clients increases.
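One way to check whether fork() is behind a slowdown is to look at the latest_fork_usec field, which Redis reports in the Stats section of the INFO command output as the duration of the most recent fork in microseconds. The following is a minimal sketch in Python that parses INFO text and converts that field to milliseconds. The helper names parse_info and latest_fork_ms and the sample text are made up for illustration; in practice the text would come from running INFO against a real server.

```python
def parse_info(raw: str) -> dict:
    """Parse the text returned by the Redis INFO command into a dict of strings."""
    fields = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Stats".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def latest_fork_ms(raw: str) -> float:
    """Return the duration of the most recent fork() in milliseconds."""
    return int(parse_info(raw)["latest_fork_usec"]) / 1000.0


# Hypothetical INFO excerpt, not taken from a real server.
SAMPLE_INFO = (
    "# Stats\r\n"
    "total_connections_received:120\r\n"
    "latest_fork_usec:215000\r\n"
)

print(latest_fork_ms(SAMPLE_INFO))  # 215.0
```

A value in the hundreds of milliseconds, as in this made-up sample, would mean that every snapshot or AOF rewrite pauses all clients for that long.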
The Yipit problem was due to a long fork() time on AWS. The instance type family used was M2, which is a family of ParaVirtual (PV) machines, as opposed to Hardware-assisted Virtual Machines (HVM). It is known that the fork() system call in a PV machine is...