System Design · March 3, 2026 · 14 min read
Design Instagram: A Step-by-Step System Design Walkthrough
How to design a photo-sharing social network that scales to 2 billion users
Why Design Instagram?
Instagram is one of the most frequently asked system design questions. It tests your understanding of:
- Media storage & delivery: Photo upload, processing, and CDN distribution
- Feed generation: Algorithmic ranking vs chronological ordering
- Scale: 2B+ monthly active users, 100M+ photos uploaded daily
- Real-time features: Stories, live video, notifications
This guide walks through a complete 45-minute interview answer, from requirements gathering through deep dives into feed generation and media pipelines.
Step 1: Clarify Requirements (5 minutes)
Functional Requirements
Ask the interviewer:
"What features should I focus on?"
Typical scope for 45-minute interview:
- Upload photos/videos with captions
- Follow/unfollow users
- News feed (photos from followed users)
- Like and comment on posts
- User profile page
Usually OUT of scope (confirm with interviewer):
- Stories (ephemeral content)
- Direct messaging
- Reels / short-form video
- Explore / discovery page
- Ads system
Non-Functional Requirements
Scale assumptions:
- 2B monthly active users (MAU)
- 500M daily active users (DAU)
- Average user uploads 1 photo every 3 days
- Average user views feed 5 times/day
- Each feed load shows 20 posts
Traffic:
- Uploads: 500M / 3 = ~170M photos/day = ~2,000/sec
- Feed reads: 500M x 5 = 2.5B/day = ~29,000/sec
Key insight: Read-heavy system (~15:1 ratio)
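These estimates are easy to sanity-check in a few lines (the rounding to ~2,000/sec and ~29,000/sec matches the figures above):

```python
# Back-of-envelope traffic check for the numbers above.
SECONDS_PER_DAY = 86_400

dau = 500_000_000
uploads_per_day = dau // 3        # 1 photo every 3 days
feed_reads_per_day = dau * 5      # 5 feed loads per user per day

uploads_per_sec = uploads_per_day / SECONDS_PER_DAY
reads_per_sec = feed_reads_per_day / SECONDS_PER_DAY

print(round(uploads_per_sec))                   # 1929  -> "~2,000/sec"
print(round(reads_per_sec))                     # 28935 -> "~29,000/sec"
print(round(reads_per_sec / uploads_per_sec))   # 15    -> ~15:1 read:write
```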
Feed generation is the hard part.
Step 2: Capacity Estimation (5 minutes)
Storage
Photo Storage:
- 170M photos/day
- Average photo: 2MB (after compression)
- Multiple resolutions stored: original + 4 sizes = ~5MB total
- 170M x 5MB = 850 TB/day
- Per year: 850 TB x 365 = ~310 PB/year
Metadata Storage:
- Post record: ~500 bytes (caption, timestamp, user_id, location)
- 170M x 500 bytes = 85 GB/day metadata
- Per year: ~31 TB metadata
Key insight: Photo storage dominates everything.
Use object storage (S3) for photos, databases only for metadata.
Bandwidth
Upload bandwidth:
- 2,000 photos/sec x 2MB = 4 GB/sec ingress
Download bandwidth:
- 29,000 feed loads/sec x 20 photos x 200KB (compressed) = 116 GB/sec
- This is massive; a CDN is essential
Cache needs:
- Hot user metadata and recent posts
- ~500GB-1TB Redis cluster for feed caches
Step 3: High-Level Architecture
┌─────────────────────────────────────────────────────────────┐
│ CLIENTS │
│ (iOS, Android, Web) │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ CDN (CloudFront) │
│ Serves photos, videos, static assets │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ API GATEWAY / LB │
│ Authentication, rate limiting, routing │
└────────┬──────────────┬──────────────┬──────────────────────┘
│ │ │
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Upload │ │ Feed │ │ Social │
│ Service │ │ Service │ │ Service │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ S3 │ │ Redis │ │ Postgres │
│ (media) │ │ (feeds) │ │ (social) │
└──────────┘ └──────────┘ └──────────┘
Service Responsibilities
1. Upload Service
- Accept photo upload (pre-signed S3 URL)
- Generate multiple resolutions (thumbnail, medium, full)
- Extract EXIF data, apply filters
- Store metadata in database
- Trigger feed fan-out
2. Feed Service
- Generate personalized feed for each user
- Hybrid approach: pre-computed + on-read for celebrities
- Rank posts by engagement, recency, relationship
3. Social Service
- Follow/unfollow operations
- Like, comment, save operations
- Notification triggers
- Activity feed
Step 4: Database Design
-- Users table (PostgreSQL - needs ACID)
CREATE TABLE users (
id BIGINT PRIMARY KEY,
username VARCHAR(30) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
bio TEXT,
avatar_url VARCHAR(500),
created_at TIMESTAMP DEFAULT NOW()
);
-- Posts table (PostgreSQL, partitioned by created_at)
CREATE TABLE posts (
id BIGINT PRIMARY KEY,
user_id BIGINT REFERENCES users(id),
caption TEXT,
media_url VARCHAR(500) NOT NULL,
media_type VARCHAR(10), -- 'photo' | 'video'
location POINT,
created_at TIMESTAMP DEFAULT NOW()
);
-- Follows (PostgreSQL)
CREATE TABLE follows (
follower_id BIGINT REFERENCES users(id),
followee_id BIGINT REFERENCES users(id),
created_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (follower_id, followee_id)
);
-- Index for "who follows me" lookups:
CREATE INDEX idx_follows_followee ON follows (followee_id, follower_id);
-- Feed cache (Redis sorted set)
-- Key: feed:{user_id}
-- Score: post timestamp (for ordering)
-- Value: post_id
Step 5: Deep Dive — Feed Generation
This is the most critical and interesting part. The interviewer will spend the most time here.
Hybrid Fan-Out Approach
Same insight as Twitter: most users have few followers.
Regular users (< 50K followers): Fan-out on WRITE
- When they post, push post_id to all follower feed caches
- Fast reads: just fetch from cache
Celebrities (> 50K followers): Fan-out on READ
- Don't fan-out on write (too expensive)
- At read time, merge celebrity posts into cached feed
Feed Read Flow:
1. Fetch pre-computed feed from Redis (posts from regular users)
2. Fetch recent posts from followed celebrities
3. Merge both lists
4. Apply ranking algorithm
5. Return top 20 posts with full metadata
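The read-time merge in steps 1-3 is a k-way merge of the cached feed with each followed celebrity's recent posts. A chronological sketch (ranking is layered on afterwards):

```python
import heapq

def read_feed(cached: list[tuple[int, int]],
              celebrity_posts: list[list[tuple[int, int]]],
              limit: int = 20) -> list[int]:
    """Merge the pre-computed feed with celebrity posts fetched at
    read time; each input list is (timestamp, post_id) ascending.
    Returns the newest `limit` post ids, newest first."""
    merged = list(heapq.merge(cached, *celebrity_posts))
    return [post_id for _, post_id in merged[-limit:]][::-1]

cached = [(100, 1), (300, 3)]
celebs = [[(200, 2)], [(400, 4)]]
print(read_feed(cached, celebs, limit=3))  # [4, 3, 2]
```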
Feed Write Flow (regular user posts):
1. Post saved to database
2. Fan-out worker fetches follower list
3. For each follower: ZADD feed:{follower_id} {timestamp} {post_id}
4. Trim feed to last 500 entries (ZREMRANGEBYRANK)
Feed Ranking
Instagram doesn't show purely chronological feeds.
Ranking signals:
- Recency (newer posts score higher)
- Relationship (users you interact with most)
- Engagement (posts with many likes/comments)
- Content type preference (photos vs videos)
- Session context (avoid showing same creator twice in a row)
Simplified scoring:
score = (recency_weight * recency_score)
+ (relationship_weight * interaction_score)
+ (engagement_weight * normalized_engagement)
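The same formula as code; the weights here are arbitrary placeholders, since in production they would come from a trained model:

```python
def score_post(recency_score: float,
               interaction_score: float,
               normalized_engagement: float,
               weights: tuple[float, float, float] = (0.5, 0.3, 0.2)) -> float:
    """Weighted linear combination of the three ranking signals.
    All inputs are assumed normalized to [0, 1]."""
    w_recency, w_relationship, w_engagement = weights
    return (w_recency * recency_score
            + w_relationship * interaction_score
            + w_engagement * normalized_engagement)

# A fresh post from a close friend beats a viral but stale one:
print(score_post(0.9, 0.8, 0.1) > score_post(0.2, 0.1, 1.0))  # True
```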
In practice: ML model trained on user engagement data.
For the interview: mention the signals, explain trade-offs.
Step 6: Media Upload Pipeline
Photo upload is async and multi-stage:
1. Client requests pre-signed S3 URL
2. Client uploads directly to S3 (bypass API servers)
3. S3 triggers Lambda/worker via event notification
4. Worker processes photo:
a. Validate image (format, size, content policy)
b. Strip EXIF data (privacy)
c. Generate resolutions:
- Thumbnail: 150x150
- Small: 320xN
- Medium: 640xN
- Large: 1080xN
- Original: preserved
d. Upload all versions to S3
e. Create CDN invalidation
f. Update post metadata with media URLs
g. Trigger feed fan-out
Total time: 2-5 seconds from upload to appearing in feeds.
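Step 4c is the interesting part of the worker: given the original dimensions, compute each target size while preserving aspect ratio. A sketch using the width breakpoints listed above (Pillow or ImageMagick would do the actual resampling):

```python
# Target widths from the pipeline above; height scales with aspect ratio.
RESOLUTIONS = {"small": 320, "medium": 640, "large": 1080}
THUMBNAIL = (150, 150)  # square center-crop, not aspect-preserving

def target_sizes(orig_w: int, orig_h: int) -> dict[str, tuple[int, int]]:
    sizes = {"thumbnail": THUMBNAIL}
    for name, width in RESOLUTIONS.items():
        if width >= orig_w:
            sizes[name] = (orig_w, orig_h)  # never upscale
        else:
            sizes[name] = (width, round(orig_h * width / orig_w))
    return sizes

print(target_sizes(4000, 3000)["small"])  # (320, 240)
print(target_sizes(4000, 3000)["large"])  # (1080, 810)
```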
Why pre-signed URLs?
- API servers don't handle large file transfers
- S3 handles upload directly (scales independently)
- Reduces API server CPU and bandwidth
Step 7: Scaling Considerations
Read Path Optimization
Multi-layer caching:
1. CDN: Serves all photos (cache-control headers)
2. Application cache (Redis): Feed post IDs + user metadata
3. Database read replicas: For cache misses
Cache strategy:
- Feed cache: Write-through on fan-out
- User metadata: Cache-aside with 5min TTL
- Post metadata: Cache-aside with 1hr TTL
- Photo URLs: Immutable, cache forever at CDN
Database Sharding
Shard by user_id:
- Posts table: shard by post author's user_id
- Follows table: shard by follower_id (optimizes "who do I follow")
- Feed cache: shard by feed owner's user_id
Why user_id?
- Feed reads access one user's data → single shard
- User profile page → single shard
- Cross-shard queries rare (search, explore)
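Shard selection by user_id is just a stable hash; a sketch (16 shards is an arbitrary number for illustration):

```python
import hashlib

NUM_SHARDS = 16

def shard_for(user_id: int) -> int:
    """Stable shard assignment: hash the user_id so the same user
    always maps to the same shard on every host and process.
    (Python's built-in hash() is salted per process, so use md5.)"""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Feed read, profile page, and follows list for one user all hit one shard:
print(shard_for(42) == shard_for(42))       # True
print(0 <= shard_for(12345) < NUM_SHARDS)   # True
```

In production you would use consistent hashing rather than plain modulo, so adding shards does not remap every user.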
Alternative: shard posts by post_id
- Better write distribution
- But feed generation requires scatter-gather across shards
- Worse for this use case
Common Interview Follow-ups
How would you handle Stories?
Stories differ from posts:
- Ephemeral (24-hour TTL)
- Sequential viewing (not ranked)
- Stored separately from main feed
Implementation:
- Store in Redis with TTL of 24 hours
- Separate "stories feed" sorted by recency
- When user opens app: fetch story list + first frame
- Lazy-load remaining story media
Storage: Much cheaper since content expires.
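With a real Redis you would simply set a 24-hour EXPIRE on each story key; a minimal in-memory sketch of the same idea:

```python
STORY_TTL = 24 * 3600  # seconds

# stories[user_id] -> list of (expires_at, story_id)
stories: dict[int, list[tuple[float, int]]] = {}

def post_story(user_id: int, story_id: int, now: float) -> None:
    stories.setdefault(user_id, []).append((now + STORY_TTL, story_id))

def active_stories(user_id: int, now: float) -> list[int]:
    """Drop expired entries on read, like Redis lazy expiration."""
    live = [(exp, sid) for exp, sid in stories.get(user_id, []) if exp > now]
    stories[user_id] = live
    return [sid for _, sid in live]

post_story(1, 100, now=0)
print(active_stories(1, now=3600))       # [100] - still live
print(active_stories(1, now=25 * 3600))  # []    - expired after 24h
```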
Use Redis with expiry vs permanent database storage.
How would you build Explore/Discovery?
Explore shows posts from users you DON'T follow.
Two-stage approach:
1. Candidate generation (offline):
- Collaborative filtering: "users like you liked these"
- Content-based: similar hashtags, locations, visual features
- Generate candidate pool of ~10K posts per user
2. Ranking (online):
- ML model scores each candidate
- Diversity filter (avoid too many of same type)
- Return top 50-100 for the session
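The diversity filter in stage 2 can be as simple as deferring a candidate whose author just appeared (a sketch; the real model-driven version is far more involved):

```python
def diversify(ranked: list[tuple[int, int]], limit: int = 50) -> list[int]:
    """ranked: (post_id, author_id) pairs, best first.
    Avoid showing the same creator twice in a row."""
    result: list[int] = []
    deferred: list[tuple[int, int]] = []
    last_author = None
    for post_id, author_id in ranked:
        if author_id == last_author:
            deferred.append((post_id, author_id))  # retry later
            continue
        result.append(post_id)
        last_author = author_id
        if len(result) == limit:
            break
    # Backfill with deferred posts if we came up short.
    for post_id, _ in deferred:
        if len(result) == limit:
            break
        result.append(post_id)
    return result

print(diversify([(1, 9), (2, 9), (3, 7), (4, 9)], limit=4))  # [1, 3, 4, 2]
```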
Infrastructure:
- Offline pipeline generates candidates every few hours
- Online ranker is a lightweight ML model
- Results cached with 1-hour TTL
Key Takeaways for the Interview
- Lead with the feed: Feed generation is the core technical challenge — show you understand fan-out trade-offs
- Separate media from metadata: S3 for photos, databases for everything else
- CDN is critical: Without it, you can't serve 116 GB/sec of photo downloads
- Hybrid fan-out: Push for regular users, pull for celebrities — this shows sophistication
- Pre-signed URLs: Mention this for uploads to show you understand real-world patterns
Practice This on HireReady
System design questions like this appear in Meta, Google, and Amazon interviews. Practice with our AI voice interviewer for real-time feedback on your communication and structure.