System Design Card 425 — Replication and Failover / Check
Concern
Replication improves durability and availability, but it changes write paths and read semantics and adds operational complexity. Primary-replica databases, leader election, and multi-region copies each address a different failure model.
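To make the change in read semantics concrete, here is a minimal sketch of lag-aware read routing. It assumes an asynchronous primary-replica setup in which each replica reports its replication lag. The names Replica, lag_seconds, and choose_read_target are hypothetical and do not come from any particular database or driver.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    lag_seconds: float  # hypothetical: lag reported by the replica's health check


def choose_read_target(replicas, max_lag_seconds, needs_fresh_read):
    """Return the name of the node a read should go to.

    Reads that must observe the caller's own recent write go to the primary.
    Other reads go to the least-lagged replica within the staleness budget,
    falling back to the primary when no replica qualifies.
    """
    if needs_fresh_read:
        return "primary"
    eligible = [r for r in replicas if r.lag_seconds <= max_lag_seconds]
    if not eligible:
        return "primary"
    return min(eligible, key=lambda r: r.lag_seconds).name


replicas = [Replica("replica-a", 0.4), Replica("replica-b", 3.0)]
print(choose_read_target(replicas, max_lag_seconds=1.0, needs_fresh_read=False))
print(choose_read_target(replicas, max_lag_seconds=1.0, needs_fresh_read=True))
```

The trade-off this exposes is that a tight staleness budget pushes reads back onto the primary, which reduces the read scaling that replicas were added to provide.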
What Check means for this concern
In BASIC, the Check step is where you review the design for bottlenecks, failure modes, security gaps, observability, and cost. For Replication and Failover, that means the candidate should make this concern visible at the right moment instead of bolting it on at the end.
Design move
A good move is to review and stress-test the design before you hand the answer over. Tie the concern back to the user flow, the workload, and the dominant trade-off. That keeps the design grounded and makes it easier for the interviewer to follow why a cache, queue, replica, partition, or rate limiter is actually necessary.
Common miss
The miss is saying 'replicate it' without addressing lag, failover timing, or write ownership. BASIC helps because the staged flow keeps this concern proportional to the prompt and connected to the rest of the architecture.
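Failover timing and write ownership can be sketched in a few lines. The example below is illustrative only. It assumes heartbeat-based failure detection and a monotonically increasing epoch number assigned on each promotion. The function failover_window_seconds, the class FencedStore, and all numbers are hypothetical.

```python
def failover_window_seconds(heartbeat_interval, missed_heartbeats, promotion_seconds):
    """Rough upper bound on write unavailability during a failover:
    time to detect the failure plus time to promote a replica."""
    return heartbeat_interval * missed_heartbeats + promotion_seconds


class FencedStore:
    """Accepts a write only if the writer's epoch is not older than the newest seen.

    This is one way to enforce write ownership: a demoted primary that still
    believes it is the leader carries an old epoch and gets rejected.
    """

    def __init__(self):
        self.highest_epoch = 0
        self.data = {}

    def write(self, epoch, key, value):
        if epoch < self.highest_epoch:
            return False  # stale leader, reject the write
        self.highest_epoch = epoch
        self.data[key] = value
        return True


# Assumed numbers: 2 s heartbeats, 3 missed beats to declare failure, 10 s promotion.
print(failover_window_seconds(2, 3, 10))  # 16

store = FencedStore()
store.write(1, "cart", "old primary")      # accepted under epoch 1
store.write(2, "cart", "new primary")      # accepted after promotion to epoch 2
print(store.write(1, "cart", "stale"))     # False
```

With asynchronous replication, any writes the old primary acknowledged but had not yet shipped to the promoted replica are lost, so replication lag at the moment of failure bounds the data-loss window alongside the unavailability window computed above.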
BASIC prompt
“When I reach the Check stage, how does Replication and Failover change the architecture, the trade-offs, or the review checklist?”