Design Twitter: A Step-by-Step System Design Walkthrough
The complete guide to acing the most common system design interview question
Why Design Twitter?
Twitter (now X) is one of the most commonly asked system design questions in tech interviews. It tests your understanding of:
- Data modeling: Users, tweets, follows, likes
- Scale: 500M tweets/day, 300M users
- Trade-offs: Consistency vs latency for feeds
- Real-time systems: Notifications, trending
This guide walks through a complete interview answer, including the specific questions to ask, calculations to make, and architectures to draw.
Step 1: Clarify Requirements (5 minutes)
Never start designing immediately. Ask questions to scope the problem and show structured thinking.
Functional Requirements
Ask the interviewer:
"What features should I focus on?"
Typical scope for 45-minute interview:
- Post tweets (280 chars, optional media)
- Follow/unfollow users
- Home timeline (tweets from followed users)
- User profile (user's own tweets)
Usually OUT of scope (confirm with interviewer):
- Direct messages
- Search
- Trending topics (may be deep-dive)
- Likes/retweets (simple, mention briefly)
- Notifications (may be deep-dive)

Non-Functional Requirements
Ask the interviewer:
"What scale should I design for?"
Typical assumptions:
- 300M monthly active users (MAU)
- 100M daily active users (DAU)
- Users post 0.5 tweets/day on average
- Users view timeline 10 times/day
- 20% of tweets have media (images/video)
Performance requirements:
- Timeline loads < 500ms
- High availability (99.9%)
- Eventual consistency acceptable for social media
Key insight: Read-heavy system (roughly 20:1 read-to-write ratio)
This drives our architecture decisions.

Pro Tip
Write these requirements on the whiteboard. Reference them when making decisions: "Since we agreed eventual consistency is acceptable, we can use caching aggressively."
Step 2: Capacity Estimation (5 minutes)
Use back-of-envelope calculations to guide architecture decisions. Round aggressively; interviewers want to see your reasoning, not exact math.
Traffic Estimation
Write Traffic (Tweets):
- 100M DAU x 0.5 tweets/day = 50M tweets/day
- 50M / 86,400 sec = ~600 tweets/second
- Peak (3x average) = ~1,800 tweets/second
Read Traffic (Timeline Views):
- 100M DAU x 10 views/day = 1B timeline views/day
- 1B / 86,400 = ~12,000 reads/second
- Peak = ~36,000 reads/second
Ratio: 36,000 reads / 1,800 writes = 20:1 read-heavy
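These estimates are easy to sanity-check in code. A quick sketch, using only the assumptions stated above (unrounded values; the article rounds to ~600, ~1,800, ~12,000, and ~36,000):

```python
# Back-of-envelope capacity estimation, mirroring the numbers above.
DAU = 100_000_000
TWEETS_PER_USER_PER_DAY = 0.5
TIMELINE_VIEWS_PER_USER_PER_DAY = 10
SECONDS_PER_DAY = 86_400
PEAK_FACTOR = 3

tweets_per_day = DAU * TWEETS_PER_USER_PER_DAY          # 50M tweets/day
writes_per_sec = tweets_per_day / SECONDS_PER_DAY       # ~600/s
peak_writes = writes_per_sec * PEAK_FACTOR              # ~1,800/s

views_per_day = DAU * TIMELINE_VIEWS_PER_USER_PER_DAY   # 1B views/day
reads_per_sec = views_per_day / SECONDS_PER_DAY         # ~12,000/s
peak_reads = reads_per_sec * PEAK_FACTOR                # ~36,000/s

# Storage: ~500 bytes of text+metadata per tweet; 20% carry ~1MB media
text_gb_per_day = tweets_per_day * 500 / 1e9            # 25 GB/day
media_tb_per_day = tweets_per_day * 0.2 * 1e6 / 1e12    # 10 TB/day

print(f"read:write ratio: {peak_reads / peak_writes:.0f}:1")
```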
Implication: Optimize for reads (caching, pre-computation)

Storage Estimation
Tweet Storage:
- 50M tweets/day
- Tweet size: ~500 bytes (text + metadata)
- 50M x 500 bytes = 25 GB/day text data
- Per year: 25 GB x 365 = ~9 TB/year
Media Storage:
- 20% of tweets have media
- 10M media/day x 1MB average = 10 TB/day
- Per year: 10 TB x 365 = 3.6 PB/year
Key insight: Media dominates storage.
Solution: Store media in blob storage (S3), only store URLs in the database.

Memory Estimation (for Caching)
What to cache: Most recent timeline for active users
- Cache top 100 tweets per user timeline
- 100M users x (100 tweets x 500 bytes) = 5 TB
This is large but feasible with distributed cache (Redis cluster).
In practice: cache only for active users, evict inactive.
Target: ~500GB - 1TB cache cluster.

Step 3: High-Level Design (10 minutes)
Draw the architecture progressively. Start simple, add complexity.
Core Components
┌──────────────────────────────────────────────────────────────────┐
│ CLIENTS │
│ (Web, iOS, Android) │
└──────────────────────┬───────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ CDN │
│ (Static assets, cached media) │
└──────────────────────┬───────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ LOAD BALANCER │
│ (Route requests, health checks) │
└─────────┬──────────────┬──────────────┬──────────────────────────┘
│ │ │
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ API │ │ API │ │ API │
│ Server │ │ Server │ │ Server │
└────┬────┘ └────┬────┘ └────┬────┘
│ │ │
└──────────────┼──────────────┘
│
┌──────────────┴──────────────┐
│ │
▼ ▼
┌───────────┐ ┌───────────────┐
│ Redis │ │ Databases │
│ Cache │ │ (See below) │
└───────────┘ └───────────────┘

Database Architecture
Use different databases for different purposes (polyglot persistence):
┌─────────────────────────────────────────────────────────────────┐
│ DATA LAYER │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ PostgreSQL │ │ Cassandra │ │
│ │ │ │ │ │
│ │ - Users table │ │ - Tweets table │ │
│ │ - Follows table │ │ - Timeline table│ │
│ │ (Strong ACID) │ │ (High write │ │
│ │ │ │ throughput) │ │
│ └──────────────────┘ └──────────────────┘ │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ S3 │ │ Redis │ │
│ │ │ │ │ │
│ │ - Media files │ │ - Timeline cache│ │
│ │ - Images/video │ │ - User cache │ │
│ │ (Blob storage) │ │ - Rate limiting │ │
│ └──────────────────┘ └──────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Why this split:
- PostgreSQL: Users need ACID (unique usernames, email verification)
- Cassandra: Tweets are append-heavy, need horizontal scaling
- S3: Media is large, immutable, needs CDN integration
- Redis: Hot data, sub-millisecond reads for timelines

Step 4: API Design (5 minutes)
Define the key endpoints. Use RESTful conventions.
// ===== TWEET OPERATIONS =====
POST /api/v1/tweets
Request:
{
"text": "Hello world!",
"media_ids": ["uuid1", "uuid2"] // Pre-uploaded to S3
}
Response:
{
"id": "tweet_123abc",
"created_at": "2026-02-27T10:30:00Z",
"user": { "id": "user_456", "username": "alice" }
}
// ===== TIMELINE =====
GET /api/v1/timeline?cursor={cursor}&limit=20
Response:
{
"tweets": [
{
"id": "tweet_123",
"text": "Hello world!",
"user": { "id": "user_456", "username": "alice", "avatar_url": "..." },
"created_at": "2026-02-27T10:30:00Z",
"media_urls": ["https://cdn.example.com/..."],
"like_count": 42,
"retweet_count": 10
}
],
"next_cursor": "base64_encoded_timestamp"
}
// ===== FOLLOW =====
POST /api/v1/users/{user_id}/follow
DELETE /api/v1/users/{user_id}/follow
// ===== MEDIA UPLOAD =====
POST /api/v1/media/upload
Returns: pre-signed S3 URL for direct upload
After upload: returns media_id to attach to tweet

Key Design Decisions
- Cursor pagination: Better than offset for real-time feeds (handles new tweets during pagination)
- Pre-signed URLs: Client uploads directly to S3, reduces server load
- Denormalized user data: Include username in tweet response (no extra lookup)
- Rate limiting: 300 tweets/hour, 1000 API calls/15min (mention in headers)
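Cursor pagination can be as simple as base64-encoding the position of the last tweet returned. A minimal sketch (the field names are illustrative, not a fixed format):

```python
import base64
import json

def encode_cursor(last_created_at: str, last_tweet_id: str) -> str:
    """Pack the position of the last returned tweet into an opaque cursor."""
    raw = json.dumps({"created_at": last_created_at, "id": last_tweet_id})
    return base64.urlsafe_b64encode(raw.encode()).decode()

def decode_cursor(cursor: str) -> dict:
    """Recover the (created_at, id) position to resume from."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode()).decode())

cursor = encode_cursor("2026-02-27T10:30:00Z", "tweet_123")
pos = decode_cursor(cursor)
# The next page query becomes: WHERE (created_at, id) < (:created_at, :id)
```

Because the cursor pins a position rather than an offset, tweets posted mid-pagination cannot shift the pages underneath the reader.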
Step 5: Deep Dive - Timeline Generation
This is the most interesting part of Twitter's architecture and where interviewers spend the most time. The core question: When a user posts a tweet, how do their followers see it?
Approach 1: Fan-out on Write (Push Model)
When user posts tweet:
1. Write tweet to Tweets table
2. Query all followers (could be millions)
3. Insert tweet_id into each follower's timeline
User posts → Write to 10,000 follower timelines
┌──────────┐
│ Tweet │
│ Posted │
└────┬─────┘
│
▼
┌──────────────────────────────────────────┐
│ Fan-out Worker │
│ For each follower: INSERT into timeline │
└───┬───────────────────────────────────────┘
│
├──► Timeline_user_1: [tweet_id, ...]
├──► Timeline_user_2: [tweet_id, ...]
├──► Timeline_user_3: [tweet_id, ...]
└──► ... (thousands of writes)
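A minimal sketch of such a fan-out worker, with in-memory dicts standing in for the Follows table and the Cassandra timeline table (both stand-ins are hypothetical; in production this runs asynchronously off a queue, in batches):

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the real stores.
followers_by_user = {"alice": ["bob", "carol", "dave"]}
timelines = defaultdict(list)  # user_id -> tweet_ids, newest first

def fan_out_on_write(author_id: str, tweet_id: str) -> int:
    """Push a new tweet_id into every follower's pre-computed timeline.

    Returns the number of timeline writes performed -- this is the
    'write amplification' the cons list warns about.
    """
    followers = followers_by_user.get(author_id, [])
    for follower_id in followers:
        timelines[follower_id].insert(0, tweet_id)
    return len(followers)

writes = fan_out_on_write("alice", "tweet_001")
```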
Pros:
+ Timeline read is fast (already pre-computed)
+ Simple read path: SELECT from user's timeline
Cons:
- Celebrity with 10M followers = 10M writes per tweet
- High write amplification
- Delay in tweet appearing for followersApproach 2: Fan-out on Read (Pull Model)
When user views timeline:
1. Query list of followed users
2. For each followed user, fetch recent tweets
3. Merge and sort by timestamp
4. Return top N tweets
User requests timeline → Query N users' tweets → Merge
┌──────────┐
│ Read │
│ Timeline │
└────┬─────┘
│
▼
┌─────────────────────────────────────────────┐
│ Get followed users (100 users) │
└────┬────────────────────────────────────────┘
│
├──► Fetch tweets from user_1
├──► Fetch tweets from user_2
├──► Fetch tweets from user_3
└──► ... (100 queries, parallelized)
│
▼
┌──────────────────────────────────────────────┐
│ Merge all tweets, sort by time, return 20 │
└──────────────────────────────────────────────┘
Pros:
+ Tweet post is fast (1 write)
+ No write amplification
+ Fresh data always
Cons:
- Read is slow (N queries + merge)
- Complex read path
- Latency increases with following count

The Hybrid Approach (What Twitter Actually Uses)
Insight: 99% of users have < 10K followers. 1% are celebrities.
Strategy:
- Regular users (< 10K followers): Fan-out on WRITE
→ Pre-compute timelines, fast reads
- Celebrities (> 10K followers): Fan-out on READ
→ Don't fan-out, merge at read time
At read time:
1. Fetch pre-computed timeline from cache
2. Query tweets from followed celebrities
3. Merge both, sort by time
4. Return combined feed
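The read-time merge is essentially a k-way merge by timestamp. A sketch with illustrative data shapes, where each entry is a (timestamp, tweet_id) pair sorted newest-first:

```python
import heapq
from itertools import islice

def merge_timeline(precomputed, celebrity_feeds, limit=20):
    """Merge the cached timeline with on-demand celebrity feeds.

    heapq.merge performs a lazy k-way merge over already-sorted inputs;
    reverse=True preserves the newest-first ordering.
    """
    merged = heapq.merge(precomputed, *celebrity_feeds, reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]

cached = [(105, "t5"), (101, "t1")]   # pre-computed timeline from Redis
celeb_a = [(104, "t4"), (100, "t0")]  # celebrity feed fetched on demand
celeb_b = [(103, "t3")]
feed = merge_timeline(cached, [celeb_a, celeb_b], limit=3)
# feed == ["t5", "t4", "t3"]
```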
┌──────────────────────────────────────────────────────────────────┐
│ TIMELINE READ │
├──────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────────────────────┐ │
│ │ Pre-computed │ │ Celebrity tweets (fetched │ │
│ │ timeline from │ + │ on-demand from their feeds) │ │
│ │ Redis cache │ │ Elon, Taylor, etc. │ │
│ └────────┬────────┘ └──────────────┬──────────────────┘ │
│ │ │ │
│ └──────────────┬──────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────┐ │
│ │ Merge & Sort │ │
│ │ by timestamp │ │
│ └───────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────┐ │
│ │ Return top 20 │ │
│ │ tweets to user │ │
│ └───────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────┘
Why this works:
- Most timelines = cache read + merge ~3 celebrity feeds
- Cache hit = 10ms, merge = 20ms, total < 50ms
- Celebrity posting: no fan-out delay

Interview Insight
Mention that the threshold (10K followers) is tunable based on monitoring. Twitter has adjusted this over time. Show you understand it's a pragmatic engineering decision, not a fixed rule.
Step 6: Deep Dive - Database Schema
Users Table (PostgreSQL)
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
username VARCHAR(15) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
display_name VARCHAR(50),
bio TEXT,
avatar_url TEXT,
followers_count INT DEFAULT 0,
following_count INT DEFAULT 0,
is_celebrity BOOLEAN DEFAULT FALSE, -- For hybrid fan-out
created_at TIMESTAMP DEFAULT NOW()
);
-- No separate index needed on username: the UNIQUE constraint
-- already creates one.
-- Why PostgreSQL: Strong consistency for unique username check,
-- ACID for account operations, complex queries for user search

Follows Table (PostgreSQL)
CREATE TABLE follows (
follower_id UUID REFERENCES users(id),
followee_id UUID REFERENCES users(id),
created_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (follower_id, followee_id)
);
-- Get who I follow (for timeline generation)
CREATE INDEX idx_follows_follower ON follows(follower_id);
-- Get my followers (for fan-out on write)
CREATE INDEX idx_follows_followee ON follows(followee_id);
-- Trigger to update follower/following counts
-- (or handle in application layer)

Tweets Table (Cassandra)
// Cassandra schema - optimized for write throughput
// Partition key: user_id (all tweets by user together)
// Clustering key: created_at DESC (recent tweets first)
CREATE TABLE tweets (
user_id UUID,
tweet_id TIMEUUID, -- Time-based UUID, auto-sorted
text TEXT,
media_urls LIST<TEXT>,
created_at TIMESTAMP,
PRIMARY KEY (user_id, tweet_id)
) WITH CLUSTERING ORDER BY (tweet_id DESC);
// Counters live in a separate table: Cassandra does not allow
// counter columns to mix with regular columns.
CREATE TABLE tweet_counts (
tweet_id TIMEUUID PRIMARY KEY,
like_count COUNTER,
retweet_count COUNTER
);
// Query: Get recent tweets by user (for profile page)
SELECT * FROM tweets WHERE user_id = ? LIMIT 20;
// Why Cassandra:
// - 50M tweets/day needs horizontal scaling
// - Time-series pattern (recent tweets = hot data)
// - Eventual consistency acceptable
// - Counters for like/retweet counts

Timeline Table (Cassandra)
// Pre-computed timelines from fan-out on write
CREATE TABLE timeline (
user_id UUID, -- Owner of this timeline
tweet_id TIMEUUID, -- Tweet in their feed
author_id UUID, -- Who posted it
PRIMARY KEY (user_id, tweet_id)
) WITH CLUSTERING ORDER BY (tweet_id DESC)
AND default_time_to_live = 604800; -- 7 day TTL
// Query: Get timeline for user
SELECT * FROM timeline WHERE user_id = ? LIMIT 20;
// TTL ensures old timeline entries auto-delete
// (user won't scroll back 7 days anyway)

Step 7: Scaling and Reliability
Caching Strategy
Cache Layer (Redis Cluster):
1. Timeline Cache
- Key: timeline:{user_id}
- Value: Sorted Set of (tweet_id, timestamp)
- TTL: 5 minutes (refresh on read)
- Size: ~50KB per user (1000 tweet IDs)
2. User Cache
- Key: user:{user_id}
- Value: JSON of user profile
- TTL: 1 hour
- Invalidate on profile update
3. Tweet Cache
- Key: tweet:{tweet_id}
- Value: JSON of tweet + embedded user info
- TTL: 24 hours (tweets are immutable)
Cache-aside pattern:
1. Check cache
2. On miss: query database
3. Store in cache with TTL
4. Return to client
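The cache-aside read path maps directly to code. A sketch with a plain dict standing in for Redis and a stub `fetch_from_db` standing in for the Cassandra query (both stand-ins are hypothetical):

```python
import time

cache = {}  # stand-in for Redis: key -> (value, expires_at)
TTL_SECONDS = 300  # 5 minutes, per the timeline cache policy above

def fetch_from_db(user_id: str) -> list:
    """Stub for the Cassandra timeline query."""
    return [f"tweet_for_{user_id}"]

def get_timeline(user_id: str) -> list:
    key = f"timeline:{user_id}"
    entry = cache.get(key)
    if entry and entry[1] > time.time():             # 1. check cache
        return entry[0]                              #    hit: serve from cache
    value = fetch_from_db(user_id)                   # 2. miss: query database
    cache[key] = (value, time.time() + TTL_SECONDS)  # 3. store with TTL
    return value                                     # 4. return to client
```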
Target: 95%+ cache hit rate for timeline reads

Handling Failures
What could go wrong?
1. Database failure
- Replication: Primary + 2 replicas per shard
- Automatic failover with consensus
- Timeline reads served from cache during outage
2. Cache failure
- Redis cluster with replicas
- Graceful degradation: fall back to database
- Circuit breaker to prevent cascade
3. Celebrity tweet goes viral
- Rate limit fan-out worker
- Queue backpressure
- Prioritize recent followers
4. DDoS attack
- CDN-level rate limiting (CloudFlare)
- Application-level rate limiting (Redis)
- Bot detection and CAPTCHA

Monitoring
Key metrics to track:
Latency:
- Timeline P50, P95, P99 latency
- Tweet post latency
- Fan-out completion time
Throughput:
- Tweets per second
- Timeline reads per second
- Cache hit rate (target: >95%)
Errors:
- Failed tweet posts
- Timeout rate
- 5xx error rate
Infrastructure:
- Database replication lag
- Cache memory usage
- Queue depth for fan-out workers
Alerting thresholds:
- P99 latency > 500ms
- Error rate > 0.1%
- Cache hit rate < 90%
- Replication lag > 10 seconds

Common Interview Questions
How do you handle a tweet from Elon Musk?
Fan-out on read. Don't push to 150M followers. When users load timeline, merge their pre-computed timeline with a real-time query of followed celebrities. Cache celebrity tweets aggressively.
What if a user follows 5,000 accounts?
Their pre-computed timeline still works (receives fan-out from followed non-celebrities). At read time, merge with ~50 celebrity feeds they follow. 50 parallel queries with caching is fast.
How do you implement trending topics?
Stream processing (Kafka + Flink). Count hashtags in 5-minute sliding windows. Apply decay function for recency. Normalize against baseline to avoid always-popular topics. Geographic partitioning for local trends.
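The real pipeline runs in Flink over Kafka, but the sliding-window idea fits in a few lines. A toy sketch (no decay or baseline normalization):

```python
from collections import Counter, deque

WINDOW_SECONDS = 300  # 5-minute window

events = deque()   # (timestamp, hashtag), oldest first
counts = Counter() # hashtag -> count within the window

def record(timestamp: float, hashtag: str) -> None:
    """Count a hashtag occurrence and evict events older than the window."""
    events.append((timestamp, hashtag))
    counts[hashtag] += 1
    while events and events[0][0] < timestamp - WINDOW_SECONDS:
        _, old_tag = events.popleft()
        counts[old_tag] -= 1
        if counts[old_tag] == 0:
            del counts[old_tag]

def trending(n: int = 10) -> list:
    return [tag for tag, _ in counts.most_common(n)]

record(0, "#ai"); record(10, "#ai"); record(20, "#rust")
record(400, "#rust")  # evicts everything older than t=100
```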
How do you handle duplicate tweets (spam)?
Content hash comparison, rate limiting per user, ML-based spam detection on write path. For retweets, store reference to original rather than duplicating content.
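Content-hash comparison is nearly a one-liner with hashlib. A sketch, where the normalization rules and the in-memory set (Redis with a TTL in production) are illustrative assumptions:

```python
import hashlib

seen_hashes = set()  # in production: a Redis set with a TTL per user

def is_duplicate(user_id: str, text: str) -> bool:
    """Flag a tweet if this user recently posted identical (normalized) text."""
    normalized = " ".join(text.lower().split())  # collapse case and whitespace
    digest = hashlib.sha256(f"{user_id}:{normalized}".encode()).hexdigest()
    if digest in seen_hashes:
        return True
    seen_hashes.add(digest)
    return False
```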
Summary Checklist
- Ask clarifying questions (scope, scale, priorities)
- Calculate: 50M tweets/day, 1B timeline reads/day, 20:1 read-heavy
- Draw: CDN, LB, API servers, cache, databases (polyglot)
- Explain hybrid fan-out: push for regular users, pull for celebrities
- Schema: Users/Follows in PostgreSQL, Tweets/Timeline in Cassandra
- Caching: Redis for timelines, 95%+ hit rate target
- Handle edge cases: viral tweets, failures, spam
- Discuss trade-offs at every decision point
Continue Learning
- System Design Interview Framework - The 6-step approach for any system design question
- Design URL Shortener - Another classic system design problem
- Practice System Design - Test your knowledge with interactive questions