System Design · March 3, 2026 · 14 min read
Design Instagram: A Step-by-Step System Design Walkthrough
How to design a photo-sharing social network that scales to 2 billion users
Why Design Instagram?
Instagram is one of the most frequently asked system design questions. It tests your understanding of:
- Media storage & delivery: Photo upload, processing, and CDN distribution
- Feed generation: Algorithmic ranking vs chronological ordering
- Scale: 2B+ monthly active users, 100M+ photos uploaded daily
- Real-time features: Stories, live video, notifications
This guide walks through a complete 45-minute interview answer, from requirements gathering through deep dives into feed generation and media pipelines.
Step 1: Clarify Requirements (5 minutes)
Functional Requirements
Ask the interviewer:
"What features should I focus on?"
Typical scope for 45-minute interview:
- Upload photos/videos with captions
- Follow/unfollow users
- News feed (photos from followed users)
- Like and comment on posts
- User profile page
Usually OUT of scope (confirm with interviewer):
- Stories (ephemeral content)
- Direct messaging
- Reels / short-form video
- Explore / discovery page
- Ads system
Non-Functional Requirements
Scale assumptions:
- 2B monthly active users (MAU)
- 500M daily active users (DAU)
- Average user uploads 1 photo every 3 days
- Average user views feed 5 times/day
- Each feed load shows 20 posts
Traffic:
- Uploads: 500M / 3 = ~170M photos/day = ~2,000/sec
- Feed reads: 500M x 5 = 2.5B/day = ~29,000/sec
Key insight: Read-heavy system (~15:1 ratio)
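These estimates are easy to sanity-check in a few lines (the rounding to ~2,000/sec and ~29,000/sec matches the figures above):

```python
# Back-of-envelope traffic check for the numbers above.
SECONDS_PER_DAY = 86_400

dau = 500_000_000
uploads_per_day = dau // 3        # 1 photo every 3 days
feed_reads_per_day = dau * 5      # 5 feed loads per user per day

uploads_per_sec = uploads_per_day / SECONDS_PER_DAY
reads_per_sec = feed_reads_per_day / SECONDS_PER_DAY

print(round(uploads_per_sec))                   # 1929  -> "~2,000/sec"
print(round(reads_per_sec))                     # 28935 -> "~29,000/sec"
print(round(reads_per_sec / uploads_per_sec))   # 15    -> ~15:1 read:write
```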
Feed generation is the hard part.
Step 2: Capacity Estimation (5 minutes)
Storage
Photo Storage:
- 170M photos/day
- Average photo: 2MB (after compression)
- Multiple resolutions stored: original + 4 sizes = ~5MB total
- 170M x 5MB = 850 TB/day
- Per year: 850 TB x 365 = ~310 PB/year
Metadata Storage:
- Post record: ~500 bytes (caption, timestamp, user_id, location)
- 170M x 500 bytes = 85 GB/day metadata
- Per year: ~31 TB metadata
Key insight: Photo storage dominates everything.
Use object storage (S3) for photos, databases only for metadata.
Bandwidth
Upload bandwidth:
- 2,000 photos/sec x 2MB = 4 GB/sec ingress
Download bandwidth:
- 29,000 feed loads/sec x 20 photos x 200KB (compressed) = 116 GB/sec
- This is massive; a CDN is essential
Cache needs:
- Hot user metadata and recent posts
- ~500GB-1TB Redis cluster for feed caches
Step 3: High-Level Architecture
┌─────────────────────────────────────────────────────────────┐
│ CLIENTS │
│ (iOS, Android, Web) │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ CDN (CloudFront) │
│ Serves photos, videos, static assets │
└───────────────────────┬─────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ API GATEWAY / LB │
│ Authentication, rate limiting, routing │
└────────┬──────────────┬──────────────┬──────────────────────┘
│ │ │
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Upload │ │ Feed │ │ Social │
│ Service │ │ Service │ │ Service │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ S3 │ │ Redis │ │ Postgres │
│ (media) │ │ (feeds) │ │ (social) │
└──────────┘ └──────────┘ └──────────┘
Service Responsibilities
1. Upload Service
- Accept photo upload (pre-signed S3 URL)
- Generate multiple resolutions (thumbnail, medium, full)
- Extract EXIF data, apply filters
- Store metadata in database
- Trigger feed fan-out
2. Feed Service
- Generate personalized feed for each user
- Hybrid approach: pre-computed + on-read for celebrities
- Rank posts by engagement, recency, relationship
3. Social Service
- Follow/unfollow operations
- Like, comment, save operations
- Notification triggers
- Activity feed
Step 4: Database Design
-- Users table (PostgreSQL - needs ACID)
CREATE TABLE users (
id BIGINT PRIMARY KEY,
username VARCHAR(30) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
bio TEXT,
avatar_url VARCHAR(500),
created_at TIMESTAMP DEFAULT NOW()
);
-- Posts table (PostgreSQL, partitioned by created_at)
CREATE TABLE posts (
id BIGINT PRIMARY KEY,
user_id BIGINT REFERENCES users(id),
caption TEXT,
media_url VARCHAR(500) NOT NULL,
media_type VARCHAR(10), -- 'photo' | 'video'
location POINT,
created_at TIMESTAMP DEFAULT NOW()
);
-- Follows (PostgreSQL)
CREATE TABLE follows (
follower_id BIGINT REFERENCES users(id),
followee_id BIGINT REFERENCES users(id),
created_at TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (follower_id, followee_id)
);
-- Index for "who follows me" lookups:
CREATE INDEX idx_follows_followee ON follows (followee_id, follower_id);
-- Feed cache (Redis sorted set)
-- Key: feed:{user_id}
-- Score: post timestamp (for ordering)
-- Value: post_id
Step 5: Deep Dive — Feed Generation
This is the most critical and interesting part. The interviewer will spend the most time here.
Hybrid Fan-Out Approach
Same insight as Twitter: most users have few followers.
Regular users (< 50K followers): Fan-out on WRITE
- When they post, push post_id to all follower feed caches
- Fast reads: just fetch from cache
Celebrities (> 50K followers): Fan-out on READ
- Don't fan-out on write (too expensive)
- At read time, merge celebrity posts into cached feed
Feed Read Flow:
1. Fetch pre-computed feed from Redis (posts from regular users)
2. Fetch recent posts from followed celebrities
3. Merge both lists
4. Apply ranking algorithm
5. Return top 20 posts with full metadata
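The read-time merge in steps 1-3 is a k-way merge of the cached feed with each followed celebrity's recent posts. A chronological sketch (ranking is layered on afterwards):

```python
import heapq

def read_feed(cached: list[tuple[int, int]],
              celebrity_posts: list[list[tuple[int, int]]],
              limit: int = 20) -> list[int]:
    """Merge the pre-computed feed with celebrity posts fetched at
    read time; each input list is (timestamp, post_id) ascending.
    Returns the newest `limit` post ids, newest first."""
    merged = list(heapq.merge(cached, *celebrity_posts))
    return [post_id for _, post_id in merged[-limit:]][::-1]

cached = [(100, 1), (300, 3)]
celebs = [[(200, 2)], [(400, 4)]]
print(read_feed(cached, celebs, limit=3))  # [4, 3, 2]
```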
Feed Write Flow (regular user posts):
1. Post saved to database
2. Fan-out worker fetches follower list
3. For each follower: ZADD feed:{follower_id} {timestamp} {post_id}
4. Trim feed to last 500 entries (ZREMRANGEBYRANK)
Feed Ranking
Instagram doesn't show purely chronological feeds.
Ranking signals:
- Recency (newer posts score higher)
- Relationship (users you interact with most)
- Engagement (posts with many likes/comments)
- Content type preference (photos vs videos)
- Session context (avoid showing same creator twice in a row)
Simplified scoring:
score = (recency_weight * recency_score)
+ (relationship_weight * interaction_score)
+ (engagement_weight * normalized_engagement)
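The same formula as code; the weights here are arbitrary placeholders, since in production they would come from a trained model:

```python
def score_post(recency_score: float,
               interaction_score: float,
               normalized_engagement: float,
               weights: tuple[float, float, float] = (0.5, 0.3, 0.2)) -> float:
    """Weighted linear combination of the three ranking signals.
    All inputs are assumed normalized to [0, 1]."""
    w_recency, w_relationship, w_engagement = weights
    return (w_recency * recency_score
            + w_relationship * interaction_score
            + w_engagement * normalized_engagement)

# A fresh post from a close friend beats a viral but stale one:
print(score_post(0.9, 0.8, 0.1) > score_post(0.2, 0.1, 1.0))  # True
```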
In practice: ML model trained on user engagement data.
For the interview: mention the signals, explain trade-offs.
Step 6: Media Upload Pipeline
Photo upload is async and multi-stage:
1. Client requests pre-signed S3 URL
2. Client uploads directly to S3 (bypass API servers)
3. S3 triggers Lambda/worker via event notification
4. Worker processes photo:
a. Validate image (format, size, content policy)
b. Strip EXIF data (privacy)
c. Generate resolutions:
- Thumbnail: 150x150
- Small: 320xN
- Medium: 640xN
- Large: 1080xN
- Original: preserved
d. Upload all versions to S3
e. Create CDN invalidation
f. Update post metadata with media URLs
g. Trigger feed fan-out
Total time: 2-5 seconds from upload to appearing in feeds.
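Step 4c is the interesting part of the worker: given the original dimensions, compute each target size while preserving aspect ratio. A sketch using the width breakpoints listed above (Pillow or ImageMagick would do the actual resampling):

```python
# Target widths from the pipeline above; height scales with aspect ratio.
RESOLUTIONS = {"small": 320, "medium": 640, "large": 1080}
THUMBNAIL = (150, 150)  # square center-crop, not aspect-preserving

def target_sizes(orig_w: int, orig_h: int) -> dict[str, tuple[int, int]]:
    sizes = {"thumbnail": THUMBNAIL}
    for name, width in RESOLUTIONS.items():
        if width >= orig_w:
            sizes[name] = (orig_w, orig_h)  # never upscale
        else:
            sizes[name] = (width, round(orig_h * width / orig_w))
    return sizes

print(target_sizes(4000, 3000)["small"])  # (320, 240)
print(target_sizes(4000, 3000)["large"])  # (1080, 810)
```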
Why pre-signed URLs?
- API servers don't handle large file transfers
- S3 handles upload directly (scales independently)
- Reduces API server CPU and bandwidth
Step 7: Scaling Considerations
Read Path Optimization
Multi-layer caching:
1. CDN: Serves all photos (cache-control headers)
2. Application cache (Redis): Feed post IDs + user metadata
3. Database read replicas: For cache misses
Cache strategy:
- Feed cache: Write-through on fan-out
- User metadata: Cache-aside with 5min TTL
- Post metadata: Cache-aside with 1hr TTL
- Photo URLs: Immutable, cache forever at CDN
Database Sharding
Shard by user_id:
- Posts table: shard by post author's user_id
- Follows table: shard by follower_id (optimizes "who do I follow")
- Feed cache: shard by feed owner's user_id
Why user_id?
- Feed reads access one user's data → single shard
- User profile page → single shard
- Cross-shard queries rare (search, explore)
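Shard selection by user_id is just a stable hash; a sketch (16 shards is an arbitrary number for illustration):

```python
import hashlib

NUM_SHARDS = 16

def shard_for(user_id: int) -> int:
    """Stable shard assignment: hash the user_id so the same user
    always maps to the same shard on every host and process.
    (Python's built-in hash() is salted per process, so use md5.)"""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Feed read, profile page, and follows list for one user all hit one shard:
print(shard_for(42) == shard_for(42))       # True
print(0 <= shard_for(12345) < NUM_SHARDS)   # True
```

In production you would use consistent hashing rather than plain modulo, so adding shards does not remap every user.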
Alternative: shard posts by post_id
- Better write distribution
- But feed generation requires scatter-gather across shards
- Worse for this use case
Common Interview Follow-ups
How would you handle Stories?
Stories differ from posts:
- Ephemeral (24-hour TTL)
- Sequential viewing (not ranked)
- Stored separately from main feed
Implementation:
- Store in Redis with TTL of 24 hours
- Separate "stories feed" sorted by recency
- When user opens app: fetch story list + first frame
- Lazy-load remaining story media
Storage: Much cheaper since content expires.
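With a real Redis you would simply set a 24-hour EXPIRE on each story key; a minimal in-memory sketch of the same idea:

```python
STORY_TTL = 24 * 3600  # seconds

# stories[user_id] -> list of (expires_at, story_id)
stories: dict[int, list[tuple[float, int]]] = {}

def post_story(user_id: int, story_id: int, now: float) -> None:
    stories.setdefault(user_id, []).append((now + STORY_TTL, story_id))

def active_stories(user_id: int, now: float) -> list[int]:
    """Drop expired entries on read, like Redis lazy expiration."""
    live = [(exp, sid) for exp, sid in stories.get(user_id, []) if exp > now]
    stories[user_id] = live
    return [sid for _, sid in live]

post_story(1, 100, now=0)
print(active_stories(1, now=3600))       # [100] - still live
print(active_stories(1, now=25 * 3600))  # []    - expired after 24h
```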
Use Redis with expiry vs permanent database storage.
How would you build Explore/Discovery?
Explore shows posts from users you DON'T follow.
Two-stage approach:
1. Candidate generation (offline):
- Collaborative filtering: "users like you liked these"
- Content-based: similar hashtags, locations, visual features
- Generate candidate pool of ~10K posts per user
2. Ranking (online):
- ML model scores each candidate
- Diversity filter (avoid too many of same type)
- Return top 50-100 for the session
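The diversity filter in stage 2 can be as simple as deferring a candidate whose author just appeared (a sketch; the real model-driven version is far more involved):

```python
def diversify(ranked: list[tuple[int, int]], limit: int = 50) -> list[int]:
    """ranked: (post_id, author_id) pairs, best first.
    Avoid showing the same creator twice in a row."""
    result: list[int] = []
    deferred: list[tuple[int, int]] = []
    last_author = None
    for post_id, author_id in ranked:
        if author_id == last_author:
            deferred.append((post_id, author_id))  # retry later
            continue
        result.append(post_id)
        last_author = author_id
        if len(result) == limit:
            break
    # Backfill with deferred posts if we came up short.
    for post_id, _ in deferred:
        if len(result) == limit:
            break
        result.append(post_id)
    return result

print(diversify([(1, 9), (2, 9), (3, 7), (4, 9)], limit=4))  # [1, 3, 4, 2]
```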
Infrastructure:
- Offline pipeline generates candidates every few hours
- Online ranker is a lightweight ML model
- Results cached with 1-hour TTL
Key Takeaways for the Interview
- Lead with the feed: Feed generation is the core technical challenge — show you understand fan-out trade-offs
- Separate media from metadata: S3 for photos, databases for everything else
- CDN is critical: Without it, you can't serve 116 GB/sec of photo downloads
- Hybrid fan-out: Push for regular users, pull for celebrities — this shows sophistication
- Pre-signed URLs: Mention this for uploads to show you understand real-world patterns
Practice This on HireReady
System design questions like this appear in Meta, Google, and Amazon interviews. Practice with our AI voice interviewer for real-time feedback on your communication and structure.