System Design Card 442 — Observability / Assess
Concern
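Without metrics, logs, and traces, claims about scale and reliability are mostly speculation. Signals such as queue lag, error rate, cache hit ratio, and P95 latency usually say more about real behavior than averages or anecdotes, because an average can look healthy while a meaningful share of requests is slow or failing.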
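As a rough illustration of why a tail percentile can matter more than an average, the sketch below computes the mean, median, and P95 of a made-up latency sample, plus a cache hit ratio. The sample values, the counter values, and the helper name `percentile` are assumptions for illustration only, and nearest-rank is just one of several common percentile definitions.

```python
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100.

    Returns the smallest sample such that at least pct percent of
    the samples are less than or equal to it.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


# Hypothetical sample: 90 fast requests and 10 slow ones (milliseconds).
latencies_ms = [20] * 90 + [900] * 10

mean_ms = statistics.mean(latencies_ms)
median_ms = statistics.median(latencies_ms)
p95_ms = percentile(latencies_ms, 95)

# Hypothetical cache counters; hit ratio = hits / (hits + misses).
cache_hits, cache_misses = 850, 150
cache_hit_ratio = cache_hits / (cache_hits + cache_misses)

print(f"mean={mean_ms} ms, median={median_ms} ms, p95={p95_ms} ms")
print(f"cache hit ratio={cache_hit_ratio:.2f}")
```

In this made-up sample the mean is 108 ms and the median is 20 ms, while the P95 is 900 ms: one request in ten is slow, and only the percentile makes that visible.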
What Assess means for this concern
In BASIC, the Assess step is where you identify the main architectural pressures and choose which trade-offs are actually important. For Observability, that means the candidate should make this concern visible at the right moment instead of bolting it on at the end.
Design move
A good move is to compare plausible approaches before committing, and to name the signal that would show whether each one is working. Tie the concern back to the user flow, the workload, and the dominant trade-off. That keeps the design grounded and makes it easier for the interviewer to follow why a cache, queue, replica, partition, or rate limiter is actually necessary.
Common miss
The common miss is describing a complex system with no plan to detect a failure or localize it to a component. BASIC helps because the staged flow keeps this concern proportional to the prompt and connected to the rest of the architecture.
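One way to guard against this miss is to treat observability as a review checklist item: for every component in the design, list at least one signal that would reveal a failure there. The sketch below is a minimal, hypothetical version of that check. The `Component` class, the component names, and the signal names are all assumptions for illustration and are not part of BASIC itself.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """A box in the design plus the signals that would expose its failures."""
    name: str
    signals: tuple = ()  # hypothetical signal names, e.g. "error_rate"


def unobserved(components):
    """Return names of components with no signal to detect or localize failure."""
    return [c.name for c in components if not c.signals]


# Hypothetical design under review.
design = [
    Component("api", ("error_rate", "p95_latency_ms")),
    Component("queue", ("queue_lag_s",)),
    Component("cache", ()),  # added to the design, but nothing watches it
]

gaps = unobserved(design)
print(gaps)  # ['cache']
```

Any name the check returns is a component whose failure could not be detected or localized as the design stands, which makes it a natural follow-up point in the review.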
BASIC prompt
“When I reach the Assess stage, how does Observability change the architecture, the trade-offs, or the review checklist?”