System design interviews separate mid-level candidates from senior candidates more reliably than any other interview type. They reward engineers who have built production systems and expose engineers who have only read about them. A structured framework is not about memorizing designs. It is about running the interview in a way that demonstrates the judgment the rubric measures.
This framework covers the six phases of a senior-level system design interview, the specific moves interviewers grade on, and the preparation protocol that produces consistent senior-level performance across varied prompts.
What Interviewers Actually Grade
Structured interview programs at major tech firms use rubrics with five to seven dimensions. The common set:
| Dimension | Positive Signal | Negative Signal |
|---|---|---|
| Requirements clarification | Asked about scale, users, failure modes | Jumped to solution |
| Capacity estimation | Grounded choices in numbers | Designed without calibration |
| Component design | Clear boundaries and interfaces | Monolithic blob |
| Data modeling | Matched storage to access pattern | Wrong storage for workload |
| Tradeoff reasoning | Named alternatives considered | Presented one solution as obvious |
| Reliability | Addressed failures, partitions, load | No failure reasoning |
| Communication | Clear narration, manageable board | Disorganized, hard to follow |
Scoring happens against the rubric, not against whether the design is "right." Multiple correct designs exist for most prompts. What differentiates candidates is how explicitly they demonstrate the rubric dimensions.
"Junior candidates try to get the design right. Senior candidates show me why their design is a reasonable choice among several. Staff candidates show me the system they would not build and why. The scoring dimension is judgment, not correctness." — Alex Xu, author of System Design Interview
The Six-Phase Framework
Allocate time against phases deliberately. A 45-minute interview breaks down approximately as follows:
| Phase | Time | Output |
|---|---|---|
| 1. Requirements clarification | 5 min | Functional and non-functional reqs |
| 2. Capacity estimation | 5 min | QPS, storage, bandwidth numbers |
| 3. High-level design | 10 min | Major components and data flow |
| 4. Data model and APIs | 7 min | Schemas, endpoints, access patterns |
| 5. Deep dive on 1-2 components | 13 min | Scaling, failure, consistency details |
| 6. Wrap up, tradeoffs, extensions | 5 min | What would change at 10x or 100x |
Candidates who spend 15 minutes on requirements and 5 minutes on deep dive fail the rubric on the heaviest-weighted phase.
Phase 1: Requirements Clarification
Every prompt is ambiguous. The first move is to define the functional and non-functional requirements aloud.
Functional examples for a "design Instagram" prompt:
- Users can post photos
- Users can view a feed
- Users can follow other users
- Users can like and comment
Non-functional examples:
- Read-heavy (99:1 read to write ratio typical for social)
- Hundreds of millions of daily active users
- Low latency on feed load (under 500ms p95)
- Eventual consistency acceptable for feed
- Strong consistency required for follow state
Candidates who skip this phase design against their assumed scope, which rarely matches the interviewer's intent. Explicit confirmation with the interviewer before proceeding is worth the minute it takes.
Phase 2: Capacity Estimation
Back-of-the-envelope math grounds every subsequent decision. The target is order-of-magnitude accuracy, not precision.
Example estimation for 500M daily active users posting photos:
- 500M DAU
- Each posts 0.1 photos per day on average = 50M photos/day
- Average write rate = 50M / 86,400 s ≈ 580 writes/sec; peak at ~2x average ≈ 1,150 writes/sec
- Feed reads at 10 per user per day = 5B reads/day ≈ 58K reads/sec average (≈ 116K reads/sec at a 2x peak)
- Photo size 2MB average = 100TB/day raw
- With compression cutting size by 70% = 30TB/day
- 5-year storage = 30TB/day x 365 x 5 ≈ 55PB
These numbers shape every subsequent design choice. 58K QPS reads argue for heavy caching. 55PB storage argues for object storage with CDN fronting. Without these numbers, component choices are arbitrary.
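The arithmetic above is easy to reproduce, which helps when practicing estimation under time pressure. The following is a minimal sketch, not a standard tool: the function name `estimate`, its parameter names, the 2x peak factor applied to both reads and writes, and the use of decimal units (1TB = 1,000,000MB) are all assumptions made for illustration. Note that 5B reads/day works out to roughly 58K reads/sec as a daily average, so a 2x peak lands near 116K.

```python
SECONDS_PER_DAY = 86_400


def estimate(dau, posts_per_user, reads_per_user, photo_mb,
             compression_savings, years, peak_factor=2.0):
    """Return order-of-magnitude capacity numbers for a photo-sharing workload."""
    writes_per_day = dau * posts_per_user
    reads_per_day = dau * reads_per_user

    avg_write_qps = writes_per_day / SECONDS_PER_DAY
    avg_read_qps = reads_per_day / SECONDS_PER_DAY

    # Decimal units: 1 TB = 1,000,000 MB and 1 PB = 1,000 TB.
    raw_tb_per_day = writes_per_day * photo_mb / 1_000_000
    stored_tb_per_day = raw_tb_per_day * (1 - compression_savings)
    total_pb = stored_tb_per_day * 365 * years / 1_000

    return {
        "avg_write_qps": avg_write_qps,
        "peak_write_qps": avg_write_qps * peak_factor,
        "avg_read_qps": avg_read_qps,
        "peak_read_qps": avg_read_qps * peak_factor,
        "raw_tb_per_day": raw_tb_per_day,
        "stored_tb_per_day": stored_tb_per_day,
        "total_pb": total_pb,
    }


if __name__ == "__main__":
    numbers = estimate(dau=500_000_000, posts_per_user=0.1, reads_per_user=10,
                       photo_mb=2, compression_savings=0.70, years=5)
    for name, value in numbers.items():
        print(f"{name}: {value:,.1f}")
```

In an interview the same steps are done by rounding aggressively (86,400 becomes 100,000); the script is only a way to check practice answers afterward.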
Phase 3: High-Level Design
Draw the major components and the flow between them. At this phase, the diagram should have 5 to 10 boxes and the interfaces between them. Typical components for a social feed:
- Client applications
- API gateway / load balancer
- User service
- Post service
- Media upload service
- Feed generation service
- Cache layer
- Database(s)
- Object storage
- CDN
Arrows should show direction of data flow. Explicitly mention read paths and write paths separately when they differ meaningfully.
Phase 4: Data Model and APIs
For the 2 or 3 services most central to the design, sketch the schema and the API.
User service:
User { user_id, username, email, created_at, profile_image_url }
GET /users/{user_id}
POST /users
Post service:
Post { post_id, user_id, media_url, caption, created_at, like_count }
POST /posts
GET /posts/{post_id}
The specific schema is less important than the demonstration of access pattern awareness. Choosing primary keys and indexes based on the read pattern matters.
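One way to make access pattern awareness concrete is to show the index that serves the dominant read. The sketch below is a toy in-memory model, not a recommendation for a real store: the `PostStore` class, its method names, and the choice of a `(user_id, created_at)` secondary index for a "recent posts by user" query are assumptions for illustration. Only the `Post` fields come from the schema above.

```python
from bisect import insort
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    user_id: int
    media_url: str
    caption: str
    created_at: int  # epoch seconds
    like_count: int = 0


class PostStore:
    """Toy store: primary key post_id, plus a per-user index ordered by time."""

    def __init__(self):
        self._by_id = {}
        # user_id -> list of (created_at, post_id), kept sorted ascending
        self._by_user = defaultdict(list)

    def put(self, post):
        # Assumes each post_id is written once; updates are out of scope here.
        self._by_id[post.post_id] = post
        insort(self._by_user[post.user_id], (post.created_at, post.post_id))

    def get(self, post_id):
        """Point lookup, serves GET /posts/{post_id}."""
        return self._by_id.get(post_id)

    def recent_by_user(self, user_id, limit=10):
        """Newest-first posts for one user, served from the secondary index."""
        keys = self._by_user.get(user_id, [])
        return [self._by_id[pid] for _, pid in reversed(keys[-limit:])]
```

The point to narrate is that the index exists because of the read pattern: a profile page needs "newest posts for this user" without scanning every post.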
Phase 5: Deep Dive
The deep dive is where senior candidates separate from mid-level candidates. Pick one or two components and explore scaling, consistency, partitioning, and failure.
For a feed generation service, the deep dive might cover:
- Pull model vs push model for feed generation
- Hybrid approach for celebrities with millions of followers
- Cache invalidation strategy when users post or follow
- Storage choice for feed cache (Redis vs in-memory LRU)
- Failure modes when cache is cold
The interviewer will often steer the deep dive. Responding to their steer rather than sticking to a prepared path shows adaptability.
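The push, pull, and hybrid options listed above can be sketched in a few dozen lines, which is a useful way to check that the tradeoff is understood rather than memorized. This is an illustrative in-memory model under stated assumptions: the `FeedService` class, the follower-count threshold that defines a celebrity, and the method names are all hypothetical, and a real system would back these structures with a cache and a durable store.

```python
import heapq
from collections import defaultdict


class FeedService:
    """Hybrid fanout: push for ordinary authors, pull for celebrity authors."""

    def __init__(self, celebrity_threshold=10_000):  # threshold is an assumption
        self.threshold = celebrity_threshold
        self.followers = defaultdict(set)         # author -> users following them
        self.following = defaultdict(set)         # user -> authors they follow
        self.posts_by_author = defaultdict(list)  # author -> [(ts, post_id)]
        self.inbox = defaultdict(list)            # user -> [(ts, post_id)] pushed

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= self.threshold

    def publish(self, author, post_id, ts):
        self.posts_by_author[author].append((ts, post_id))
        if not self.is_celebrity(author):
            # Fanout on write: cost grows with follower count.
            for follower in self.followers[author]:
                self.inbox[follower].append((ts, post_id))

    def feed(self, user, limit=10):
        # A set removes duplicates if an author crossed the threshold after
        # some of their posts had already been pushed to inboxes.
        candidates = set(self.inbox[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                # Fanout on read: merge celebrity posts at query time.
                candidates.update(self.posts_by_author[author])
        return [post_id for _, post_id in heapq.nlargest(limit, candidates)]
```

The sketch makes the costs visible: `publish` is linear in followers for ordinary authors, while `feed` pays an extra merge for each celebrity a user follows. It ignores backfill for new followers and cache eviction, both of which are reasonable follow-up topics.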
Phase 6: Wrap Up
The final five minutes cover extensions, tradeoffs, and what would change at different scales. "At 10x the traffic, I would shard the post store by user ID and replicate the hot partitions. At 100x, I would move to a geo-distributed system with regional writes and asynchronous cross-region replication."
This final phase often produces the biggest score movement because it demonstrates forward-looking judgment.
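Sharding by user ID, as in the 10x answer above, comes down to a stable mapping from key to shard. The sketch below assumes simple hash-modulo routing; the function names `shard_for` and `fraction_moved` are hypothetical, and SHA-256 is used only because it gives the same result in every process (Python's built-in `hash` for strings does not).

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard using a hash that is stable across processes."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(user_ids, old_shards, new_shards):
    """Share of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for u in user_ids
        if shard_for(u, old_shards) != shard_for(u, new_shards)
    )
    return moved / len(user_ids)


if __name__ == "__main__":
    ids = [f"user-{i}" for i in range(10_000)]
    print("user-42 lives on shard", shard_for("user-42", 8))
    print(f"{fraction_moved(ids, 8, 9):.0%} of keys move going from 8 to 9 shards")
```

With plain modulo routing, most keys change shards whenever the shard count changes. Naming that cost, and the alternatives that reduce it such as consistent hashing or a lookup directory, is the kind of forward-looking tradeoff this phase rewards.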
"The first ten minutes of a system design interview usually determine the outcome. Candidates who run the clarification and estimation phases well almost never fail, regardless of the specific design they land on." — Donne Martin, author of System Design Primer
Common Prompts and Typical Structures
Different prompts emphasize different tradeoffs. The framework stays constant, but the deep-dive topics shift.
| Prompt | Read vs Write | Key Tradeoff | Typical Deep Dive |
|---|---|---|---|
| Design Twitter / X | Read-heavy | Fanout on read vs write | Feed generation strategy |
| Design URL shortener | Read-heavy | Hash collision, uniqueness | ID generation, cache layer |
| Design rate limiter | Write-heavy | Accuracy vs performance | Token bucket vs sliding window |
| Design messaging app | Write-heavy | Delivery guarantees | End-to-end encryption, offline queue |
| Design search autocomplete | Read-heavy | Latency under 100ms | Trie structure, edge caching |
| Design video streaming | Read-heavy | Bandwidth, adaptive bitrate | CDN strategy, chunked encoding |
| Design ride-sharing | Balanced | Geo-queries, matching | Quadtree, driver-rider matching |
| Design distributed cache | Read-heavy | Consistency, eviction | Consistent hashing, replication |
Candidates who have practiced one prompt deeply often struggle on an unfamiliar prompt. Practicing 8 to 10 diverse prompts builds the pattern recognition that transfers across variations.
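As an example of how small a deep-dive topic from the table can be once it is understood, here is a minimal token bucket for the rate limiter prompt. It is a single-process sketch under stated assumptions: the `TokenBucket` class and its parameters are hypothetical, it is not thread-safe, and a distributed limiter would keep this state in a shared store instead of in memory.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock          # injectable so tests can control time
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill lazily on each call, never exceeding the bucket capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=5, capacity=10)
    allowed = sum(bucket.allow() for _ in range(20))
    print(f"{allowed} of 20 back-to-back requests allowed")
```

The accuracy versus performance tradeoff in the table shows up here as the choice to store two numbers per client instead of a log of request timestamps, which a sliding window log would need.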
Data Storage Decision Tree
One of the highest-scoring moves in a system design interview is matching the storage to the workload without hesitation. The framework:
- Access pattern: point lookup, range scan, analytical query, graph traversal
- Consistency requirement: strong, eventual, causal, bounded staleness
- Scale dimension: read QPS, write QPS, total storage, regional spread
- Cost sensitivity: hot data vs cold data, duration of retention
Common mappings:
| Workload | Storage Choice |
|---|---|
| User profiles, point lookups | Key-value store (DynamoDB, Cassandra) |
| Relational transactions | RDBMS (Postgres, MySQL) |
| Analytics over large tables | Column store (BigQuery, Redshift) |
| Full text search | Search engine (Elasticsearch, Solr) |
| Social graph | Graph DB (Neo4j, JanusGraph) or adjacency table |
| Time series | TSDB (InfluxDB, Prometheus) |
| Session state, rate limiting | In-memory (Redis, Memcached) |
| Media files, backups | Object storage (S3, GCS) |
Naming the store and the reason produces better rubric scores than naming the store alone. "I would use Cassandra for the post store because the access pattern is write-heavy and partition-friendly on user_id, with no need for joins at read time."
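The habit of pairing a store with its reason can be drilled with a small lookup that always returns both. The sketch below encodes the mapping table above; the `Workload` fields, the pattern labels, the rule ordering, and the wording of each reason are assumptions made for illustration, not a canonical decision procedure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    access_pattern: str             # e.g. "point_lookup", "analytical", "graph"
    needs_transactions: bool = False
    ephemeral: bool = False         # session state, counters, rate limits

# Pattern labels are hypothetical names for the rows of the mapping table.
_BY_PATTERN = {
    "point_lookup": ("Key-value store", "single-key reads and writes, no joins"),
    "analytical": ("Column store", "scans over a few columns of large tables"),
    "full_text": ("Search engine", "text queries served by an inverted index"),
    "graph": ("Graph DB or adjacency table", "traversals over relationships"),
    "time_series": ("TSDB", "append-heavy data queried by time range"),
    "blob": ("Object storage", "large files fetched whole, cheap at rest"),
}


def recommend(workload):
    """Return (store category, reason); the reason is the part that scores."""
    if workload.ephemeral:
        return ("In-memory store", "short-lived state that needs very low latency")
    if workload.needs_transactions:
        return ("RDBMS", "multi-row transactions and relational integrity")
    return _BY_PATTERN.get(
        workload.access_pattern,
        ("Undecided", "no default mapping; reason from the access pattern"),
    )


if __name__ == "__main__":
    store, reason = recommend(Workload("point_lookup"))
    print(f"I would use a {store.lower()} because of {reason}.")
```

The fallback branch is deliberate: when a workload does not match a familiar row, the right interview move is to reason aloud from the access pattern instead of forcing a familiar store onto it.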
Reliability and Failure Reasoning
Interviews at senior level and above expect explicit reasoning about failure. The common failure vectors:
- Machine failures (node crash, disk failure)
- Network partitions (split brain, cross-region latency)
- Hot partitions (celebrity users, viral content)
- Cascading failures (retry storms, thundering herd)
- Data corruption (partial writes, checksum mismatch)
For each component in the design, name the failure mode and the mitigation. Replication, circuit breakers, backpressure, bulkheading, and idempotency are the usual tools. Naming the pattern and the specific scenario it prevents produces higher scores than a generic "we would add replication."
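A circuit breaker is a good pattern to be able to sketch on demand, because it ties directly to one scenario: a failing dependency that callers keep hammering until the failure cascades. The version below is a minimal single-threaded sketch under stated assumptions: the `CircuitBreaker` and `CircuitOpenError` names, the consecutive-failure threshold, and the fixed reset timeout are all hypothetical choices, and production libraries add locking, metrics, and finer-grained half-open handling.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock        # injectable so tests can control time
        self._failures = 0         # consecutive failures seen so far
        self._opened_at = None     # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Open: fail fast instead of adding load to a sick dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let this one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            half_open_trial_failed = self._opened_at is not None
            if half_open_trial_failed or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open, or re-open, the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The scenario to name alongside it: when the feed cache or a downstream service starts timing out, the breaker turns slow failures into fast ones, which keeps request threads free and gives the dependency room to recover.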
"The failure discussion separates engineers who have operated systems from those who have only designed them in isolation. Everyone can draw a nice diagram. Fewer can explain why the diagram does not fall over in production." — Nathan Marz, creator of Apache Storm
Preparation Protocol
A 6-to-8 week preparation plan produces consistent senior-level performance.
| Week | Focus | Deliverable |
|---|---|---|
| 1 | Framework memorization, phase timing | 2 timed practice prompts |
| 2 | Capacity estimation practice | 20 order-of-magnitude problems |
| 3 | Storage decision tree, data modeling | 5 schema design exercises |
| 4 | Deep dive on scaling patterns | Caching, sharding, replication notes |
| 5 | Failure reasoning, SRE patterns | 3 post-mortem reads |
| 6 | Mock interviews with senior engineers | 3 full 60-minute mocks |
| 7 | Weak-area targeted practice | Focused prompts on weak phases |
| 8 | Final mocks, decision simplification | Go/no-go readiness check |
Recording mock interviews produces substantial gains. Playback exposes pacing issues, missed phases, and weak narration that are hard to notice in the moment.
The cognitive load of running six phases while drawing, narrating, and responding to interviewer questions is substantial. Working memory and executive function both matter. The cognitive demands of technical interviews at What's Your IQ frame how multitasking under pressure affects performance, which informs how candidates should pace themselves during the interview.
Integrating With Full Loop Preparation
System design interviews sit within a full loop that includes coding and behavioral rounds. Preparation time should balance across all three.
The behavioral portion at senior and above levels often asks about technical decisions made. The STAR method framework at Pass4Sure covers structured answers for behavioral rounds, many of which reference the same systems candidates discuss in system design.
Career positioning for senior and above requires specific scope signals. The IT career roadmap at Pass4Sure covers how scope expectations change at each level and what resume patterns work at senior versus staff.
When a loop produces an offer, salary negotiation tends to hinge on the level assigned. Pushing up one level at offer time can add $50,000 to $150,000 in total compensation. The six-figure tech salary negotiation playbook at Pass4Sure covers the specific moves that produce level upgrades.
Study Environment and Retention
System design material is dense and technical. Long uninterrupted study blocks produce better retention than fragmented sessions. The deep-work study environments profiled at Down Under Cafe describe settings that support the multi-hour focus sessions system design preparation requires.
For retention across 8 weeks of preparation, spaced repetition helps. Concepts like CAP theorem tradeoffs, consistency models, and storage choice rationale benefit from interval-based review. The spaced-repetition protocols at When Notes Fly cover the cadence that preserves knowledge across a multi-week preparation window.
Writing capability also correlates with interview performance. Engineers who can write clearly about systems typically explain them clearly on a whiteboard. The technical writing templates at Evolang include structures for design documents that transfer well to interview narration.
The Resume and the Loop
System design interviews often probe specific projects from the resume. "You mentioned you built a real-time analytics pipeline. Walk me through that system." Candidates whose resumes accurately describe designed systems produce smoother interviews because the first prompt is effectively a warm-up on familiar territory.
For candidates moving into consulting after strong interview performance, entity formation matters for tax and liability reasons. The business formation guides at Corpy cover the options for independent professionals working in technical consulting.
Portfolio links and credential verification during the loop have become more common. Shareable QR-encoded verification links for certifications and published work allow recruiters to validate quickly. The QR code generation options at QR Bar Code produce scannable credential links suitable for portfolios and resumes.
Anti-Patterns That Lower Scores
Specific patterns that consistently damage scoring:
- Jumping to component names without requirements
- Picking a database without naming the access pattern
- Using "we" throughout, which reads as reciting a past project rather than designing in the moment
- Pre-packaged designs that appear memorized
- Silent thinking without narration
- Running out of time before deep dive
- Defending the first idea against interviewer pushback
- Over-engineering for scale the requirements did not demand
Candidates who avoid these patterns and run the six-phase framework cleanly tend to score at senior or above, even when the specific design they land on differs from what the interviewer had in mind.
References
Xu, Alex. System Design Interview: An Insider's Guide. Byte Code LLC, 2020. ISBN: 979-8664653403.
Kleppmann, Martin. Designing Data-Intensive Applications. O'Reilly Media, 2017. ISBN: 978-1449373320.
Gilbert, Seth, and Nancy Lynch. "Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services." ACM SIGACT News, vol. 33, no. 2, 2002, pp. 51-59. DOI: 10.1145/564585.564601.
DeCandia, Giuseppe, et al. "Dynamo: Amazon's highly available key-value store." ACM SIGOPS Operating Systems Review, vol. 41, no. 6, 2007, pp. 205-220. DOI: 10.1145/1323293.1294281.
Corbett, James C., et al. "Spanner: Google's globally distributed database." ACM Transactions on Computer Systems, vol. 31, no. 3, 2013, pp. 1-22. DOI: 10.1145/2491245.
Lakshman, Avinash, and Prashant Malik. "Cassandra: a decentralized structured storage system." ACM SIGOPS Operating Systems Review, vol. 44, no. 2, 2010, pp. 35-40. DOI: 10.1145/1773912.1773922.
Martin, Donne. The System Design Primer. GitHub, 2024. https://github.com/donnemartin/system-design-primer
Beyer, Betsy, et al. Site Reliability Engineering: How Google Runs Production Systems. O'Reilly Media, 2016. ISBN: 978-1491929124.
