System Design · March 3, 2026 · 13 min read

Design Spotify: Music Streaming System Design Guide

Design audio streaming, recommendations, and playlist management for 500M+ users

Why Design Spotify?

Designing Spotify tests challenges that typical web applications do not have: continuous media streaming, recommendation algorithms, and offline-first mobile design. It's increasingly popular in interviews at Spotify, Apple, Amazon Music, and general system design rounds.

Audio streaming: Adaptive bitrate, buffer management, gapless playback
Recommendations: Discover Weekly, Daily Mix, Radio — personalized at scale
Content delivery: 100M+ songs, served globally with low latency
Offline mode: Download management, DRM, storage optimization

Step 1: Requirements

Functional Requirements

Core features:
- Search for songs, artists, albums, playlists
- Stream audio in real-time
- Create and manage playlists
- Follow artists and users
- Personalized recommendations (home feed, Discover Weekly)

Out of scope:
- Podcasts
- Social features (group sessions)
- Artist upload portal
- Ads system (free tier)

Non-Functional Requirements

Scale:
- 500M monthly active users
- 200M concurrent streams during peak
- 100M+ song catalog
- Average song: 3.5 minutes, ~8.4MB at 320kbps (rounded up to 10MB for estimates)

Performance:
- Song playback starts within 200ms of pressing play
- Gapless playback between songs
- Search results in < 300ms
- 99.9% availability

Key insight: This is a READ-heavy streaming system.
Audio files are immutable — perfect for aggressive caching.

Step 2: Capacity Estimation

Storage:
- 100M songs x 10MB average = 1 PB of audio files
- Multiple quality levels (96, 160, 320 kbps) = roughly 3 + 5 + 10 MB per song, so ~1.8 PB total (about 2 PB before replication)
- Metadata: 100M songs x 1KB = 100 GB

Bandwidth:
- 200M concurrent streams x 320kbps = 64 Tbps peak
- This is enormous — CDN is absolutely essential

Daily streams:
- 500M users x average 30 min/day = 250M hours of audio/day (an upper bound: it assumes every monthly user listens daily)
- ~4B individual song plays per day
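
The arithmetic above can be reproduced in a few lines. This is a minimal sketch that uses only the assumptions stated in this guide (song length, bitrates, peak streams, listening time); the function names are made up for illustration. The unrounded figures come out below the rounded estimates, since a 3.5-minute song at 320 kbps is closer to 8.4 MB than 10 MB.

```python
# Back-of-envelope capacity figures from the assumptions in this guide.
SONGS = 100_000_000
SONG_SECONDS = 3.5 * 60
BITRATES_KBPS = (96, 160, 320)
PEAK_STREAMS = 200_000_000
USERS = 500_000_000
LISTEN_MINUTES_PER_DAY = 30


def song_bytes(kbps: int, seconds: float = SONG_SECONDS) -> float:
    """Size of one encoded song: bits per second times duration, in bytes."""
    return kbps * 1000 * seconds / 8


def storage_pb() -> float:
    """Catalog size across all quality levels, in petabytes (10^15 bytes)."""
    per_song = sum(song_bytes(b) for b in BITRATES_KBPS)
    return SONGS * per_song / 1e15


def peak_bandwidth_tbps(kbps: int = 320) -> float:
    """Peak egress if every concurrent stream plays at the given bitrate."""
    return PEAK_STREAMS * kbps * 1000 / 1e12


def plays_per_day() -> float:
    """Daily song plays if every user listens for the average time."""
    return USERS * LISTEN_MINUTES_PER_DAY * 60 / SONG_SECONDS


print(f"320 kbps song: {song_bytes(320) / 1e6:.1f} MB")
print(f"Storage, all qualities: {storage_pb():.2f} PB")
print(f"Peak bandwidth: {peak_bandwidth_tbps():.0f} Tbps")
print(f"Plays per day: {plays_per_day() / 1e9:.1f}B")
```

In an interview, rounding up (10MB per song) is fine as long as you say you are doing it.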

Step 3: High-Level Architecture

┌─────────────────────────────────────────────────────────────┐
│                      CLIENT APPS                             │
│            (iOS, Android, Web, Desktop)                      │
│        Local cache, playback engine, offline storage         │
└───────────────────────┬─────────────────────────────────────┘
                        │
            ┌───────────┼───────────┐
            │           │           │
            ▼           ▼           ▼
      ┌──────────┐ ┌────────┐ ┌──────────┐
      │   CDN    │ │  API   │ │  Search  │
      │ (audio)  │ │Gateway │ │  Service │
      └──────────┘ └───┬────┘ └──────────┘
                       │
         ┌─────────────┼─────────────┐
         │             │             │
         ▼             ▼             ▼
   ┌──────────┐ ┌──────────┐ ┌──────────┐
   │ Catalog  │ │ Playlist │ │  Reco    │
   │ Service  │ │ Service  │ │  Engine  │
   └────┬─────┘ └────┬─────┘ └────┬─────┘
        │             │             │
        ▼             ▼             ▼
   ┌──────────┐ ┌──────────┐ ┌──────────┐
   │ Metadata │ │ Playlist │ │  ML      │
   │ DB       │ │ DB       │ │ Pipeline │
   └──────────┘ └──────────┘ └──────────┘

Step 4: Deep Dive — Audio Streaming

Audio streaming is NOT like video streaming (no visual seeking).

Key differences from video:
- Audio files are small (3-10MB vs 1-10GB for video)
- Gapless playback matters (no buffering between songs)
- Bitrate adaptation is simpler (3 levels vs many for video)

Streaming flow:
1. Client requests song → API returns CDN URL + metadata
2. Client starts HTTP range request to CDN
3. CDN serves from edge cache (cache hit rate: 95%+)
4. Client buffers 10-30 seconds ahead
5. Adaptive bitrate: switch quality based on connection speed
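
Step 5 of the flow can be sketched as a small client-side rule. This is an illustrative heuristic, not Spotify's actual algorithm: the safety margin, the low-buffer threshold, and the function name `pick_bitrate` are all assumptions.

```python
BITRATES_KBPS = (96, 160, 320)  # the three quality levels used in this guide


def pick_bitrate(throughput_kbps: float, buffer_seconds: float,
                 safety: float = 0.8, low_buffer_seconds: float = 5.0) -> int:
    """Choose a bitrate for the next chunk.

    Assumed policy: if the buffer is nearly empty, drop to the lowest
    quality to avoid a stall; otherwise take the highest bitrate that
    fits within a safety margin of the measured throughput.
    """
    if buffer_seconds < low_buffer_seconds:
        return BITRATES_KBPS[0]
    budget = throughput_kbps * safety
    affordable = [b for b in BITRATES_KBPS if b <= budget]
    return max(affordable) if affordable else BITRATES_KBPS[0]


print(pick_bitrate(throughput_kbps=1000, buffer_seconds=20))
print(pick_bitrate(throughput_kbps=250, buffer_seconds=20))
```

Because every quality level is pre-encoded, switching only changes which file the next chunk is fetched from.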

File format:
- Ogg Vorbis (Spotify's primary) at 96/160/320 kbps
- Files pre-encoded and stored at all quality levels
- Each song stored as ~10 second chunks for efficient seeking
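
To see why chunking makes seeking cheap, here is a minimal sketch that maps a seek position to the chunk the client should fetch. The 10-second chunk length comes from the list above; the function names are hypothetical.

```python
import math

CHUNK_SECONDS = 10  # approximate chunk length from the list above


def chunk_for_position(position_seconds: float,
                       chunk_seconds: int = CHUNK_SECONDS) -> tuple:
    """Return (chunk index, offset in seconds inside that chunk)."""
    index = int(position_seconds // chunk_seconds)
    offset = position_seconds - index * chunk_seconds
    return index, offset


def chunk_count(duration_seconds: float,
                chunk_seconds: int = CHUNK_SECONDS) -> int:
    """Number of chunks needed to cover the whole song."""
    return math.ceil(duration_seconds / chunk_seconds)


# Seeking to 1:35.5 in a 3.5-minute song touches one chunk out of 21.
print(chunk_for_position(95.5), chunk_count(210))
```

A seek then costs one chunk fetch instead of downloading everything before the target position.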

Gapless playback:
- While current song plays, pre-fetch next song's first chunk
- Start decoding next song before current ends
- Crossfade or gap removal at the audio decoder level

Client-side caching:
- LRU cache of recently played songs (configurable size)
- Downloaded songs for offline (encrypted with device key)
- Reduces CDN bandwidth by ~30%
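
The recently-played cache can be sketched as an LRU bounded by bytes rather than entry count, since the configurable limit is a storage size. This is a simplified in-memory model; a real client would store files on disk, and the class name `SongCache` is an assumption.

```python
from collections import OrderedDict


class SongCache:
    """LRU cache of audio data, bounded by total bytes."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # track_id -> bytes, oldest first

    def get(self, track_id: str):
        data = self._entries.get(track_id)
        if data is not None:
            self._entries.move_to_end(track_id)  # mark as most recently used
        return data

    def put(self, track_id: str, data: bytes) -> None:
        if len(data) > self.capacity_bytes:
            return  # larger than the whole cache, skip it
        old = self._entries.pop(track_id, None)
        if old is not None:
            self.used_bytes -= len(old)
        self._entries[track_id] = data
        self.used_bytes += len(data)
        # Evict least recently used entries until we fit again.
        while self.used_bytes > self.capacity_bytes:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)


cache = SongCache(capacity_bytes=10)
cache.put("a", b"1234")
cache.put("b", b"1234")
cache.get("a")           # "a" is now more recent than "b"
cache.put("c", b"1234")  # exceeds capacity, so "b" is evicted
```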

Step 5: Deep Dive — Search

Search across 100M songs, 10M artists, billions of playlists.

Architecture:
- Elasticsearch cluster for full-text search
- Separate indexes: songs, artists, albums, playlists, users
- Custom ranking that combines text relevance + popularity

Search ranking signals:
1. Text match quality (exact > prefix > fuzzy)
2. Popularity (stream count, follower count)
3. Recency (newer releases boosted slightly)
4. Personalization (artists you listen to ranked higher)
5. Region (local artists boosted in their country)
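
One way to combine these signals is a weighted sum applied to the candidates that the text index returns. The sketch below is illustrative only: the weights, the normalization formulas, and the `Candidate` fields are assumptions, and production systems usually learn such weights from click data.

```python
import math
from dataclasses import dataclass, replace


@dataclass
class Candidate:
    title: str
    text_score: float            # 0..1, e.g. exact=1.0, prefix=0.8, fuzzy=0.6
    stream_count: int
    days_since_release: int
    user_listens_to_artist: bool
    artist_country: str


# Hypothetical weights; they sum to 1 so the final score stays in 0..1.
WEIGHTS = {"text": 0.5, "popularity": 0.25, "recency": 0.05,
           "personal": 0.15, "region": 0.05}


def score(c: Candidate, user_country: str) -> float:
    # Log scale so a 10x more popular song is not 10x more relevant.
    popularity = min(1.0, math.log10(1 + c.stream_count) / 10)
    recency = 1 / (1 + c.days_since_release / 365)
    personal = 1.0 if c.user_listens_to_artist else 0.0
    region = 1.0 if c.artist_country == user_country else 0.0
    return (WEIGHTS["text"] * c.text_score
            + WEIGHTS["popularity"] * popularity
            + WEIGHTS["recency"] * recency
            + WEIGHTS["personal"] * personal
            + WEIGHTS["region"] * region)


hit = Candidate("Hello", 1.0, 1_000_000_000, 3000, False, "GB")
cover = Candidate("Hello (cover)", 0.6, 10_000, 30, False, "GB")
ranked = sorted([cover, hit], key=lambda c: score(c, "GB"), reverse=True)
print([c.title for c in ranked])
```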

Typeahead / autocomplete:
- Trie-based index for prefix matching
- Updated every few hours from search logs
- Top 5-10 suggestions in < 50ms
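
A minimal sketch of the trie idea: each node keeps the top-k completions under its prefix, so a lookup is a walk down the trie with no subtree scan. It assumes the index is rebuilt offline from aggregated search counts with each term added once; the class names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.top = []  # (count, term) pairs, best first, at most k entries


class Autocomplete:
    def __init__(self, k: int = 5):
        self.root = TrieNode()
        self.k = k

    def add(self, term: str, count: int) -> None:
        """Insert a term once, updating the top-k list on every prefix node."""
        node = self.root
        for ch in term.lower():
            node = node.children.setdefault(ch, TrieNode())
            node.top.append((count, term))
            node.top.sort(key=lambda pair: (-pair[0], pair[1]))
            del node.top[self.k:]

    def suggest(self, prefix: str) -> list:
        node = self.root
        for ch in prefix.lower():
            node = node.children.get(ch)
            if node is None:
                return []
        return [term for _, term in node.top]


ac = Autocomplete(k=2)
for term, count in [("metallica", 900), ("metric", 300),
                    ("method man", 500), ("madonna", 800)]:
    ac.add(term, count)
print(ac.suggest("met"))
```

Precomputing the top-k per node trades memory for latency, which is what makes a sub-50ms budget realistic.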

Handling misspellings:
- Levenshtein distance for fuzzy matching
- n-gram tokenization (break "metallica" into "met", "eta", "tal"...)
- Phonetic matching (Soundex/Metaphone) for name searches
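
The first two techniques are easy to sketch in plain Python. In a real deployment the search engine's built-in fuzzy matching and n-gram analyzers would do this work; the code below only illustrates the ideas, and the function names are made up.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def trigrams(s: str) -> set:
    """All 3-character substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, in 0..1."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(levenshtein("metalica", "metallica"))
print(sorted(trigrams("metallica")))
print(trigram_similarity("metalica", "metallica"))
```

Trigram overlap is cheap to index and narrows the candidate set; edit distance is then computed only on those candidates.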

Step 6: Recommendation Engine

Spotify's recommendations power Discover Weekly, Daily Mix, and Radio.

Three approaches combined:

1. Collaborative Filtering
   "Users who liked X also liked Y"
   - Build user-song matrix (500M users x 100M songs — sparse)
   - Matrix factorization (ALS algorithm) to find latent features
   - Find similar users, recommend their top songs

2. Content-Based Filtering
   "Songs that sound like what you play"
   - Audio features: tempo, key, energy, danceability, valence
   - Extracted via ML models from raw audio
   - Recommend songs with similar audio feature vectors

3. Natural Language Processing
   "What people say about this music"
   - Crawl blogs, reviews, social media for music descriptions
   - Build word vectors for each song/artist
   - Match user taste profile to song descriptions
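
As a concrete illustration of approach 2, here is a minimal content-based sketch: average the feature vectors of songs the user played into a taste profile, then rank unheard songs by cosine similarity. The four features and their 0..1 scaling are assumptions, and the track ids are made up.

```python
import math


def cosine(u, v) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def taste_profile(played_vectors) -> list:
    """Average of the feature vectors of songs the user has played."""
    n = len(played_vectors)
    return [sum(column) / n for column in zip(*played_vectors)]


def recommend(profile, catalog, exclude, k=3) -> list:
    """Top-k catalog tracks by similarity, skipping already-played ones."""
    scored = [(cosine(profile, vec), track_id)
              for track_id, vec in catalog.items() if track_id not in exclude]
    scored.sort(reverse=True)
    return [track_id for _, track_id in scored[:k]]


# Features (assumed, all scaled to 0..1): tempo, energy, danceability, valence
catalog = {
    "a": [0.90, 0.9, 0.8, 0.7],
    "b": [0.80, 0.9, 0.9, 0.6],
    "c": [0.85, 0.9, 0.8, 0.7],
    "d": [0.10, 0.1, 0.2, 0.9],
}
played = {"a", "b"}
profile = taste_profile([catalog[t] for t in played])
print(recommend(profile, catalog, exclude=played, k=2))
```

Content-based scoring also covers the cold-start case: a brand-new song has audio features before it has any listening history for collaborative filtering to use.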

Discover Weekly pipeline:
1. Run collaborative filtering weekly (batch job)
2. Filter: remove songs user already knows
3. Filter: ensure genre diversity
4. Rank by predicted listen probability
5. Generate 30 songs per user
6. Cache results, serve Monday morning
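
Steps 2 to 5 of the pipeline can be sketched as a single pass over scored candidates. This assumes the collaborative filtering job has already produced (track, genre, predicted listen probability) tuples; the per-genre cap is one simple, assumed way to enforce diversity.

```python
from collections import Counter


def build_discover_weekly(candidates, known_track_ids,
                          size=30, max_per_genre=10) -> list:
    """Filter known songs, cap each genre, keep the highest-ranked tracks.

    candidates: iterable of (track_id, genre, predicted_listen_probability).
    """
    per_genre = Counter()
    playlist = []
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)
    for track_id, genre, _prob in ranked:
        if track_id in known_track_ids:
            continue  # step 2: user already knows this song
        if per_genre[genre] >= max_per_genre:
            continue  # step 3: genre quota reached
        per_genre[genre] += 1
        playlist.append(track_id)
        if len(playlist) == size:
            break     # step 5: playlist is full
    return playlist


candidates = [("t1", "rock", 0.9), ("t2", "rock", 0.8), ("t3", "jazz", 0.7),
              ("t4", "rock", 0.6), ("t5", "pop", 0.5)]
print(build_discover_weekly(candidates, known_track_ids={"t2"},
                            size=3, max_per_genre=1))
```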

Infrastructure:
- Apache Spark for batch recommendation jobs
- Feature store (user listening history, song features)
- A/B testing framework for algorithm improvements

Step 7: Playlist Service

Playlists are surprisingly complex at scale:
- Billions of playlists
- Collaborative playlists (multiple editors)
- Ordering, deduplication, version history

Data model:
- Playlist metadata: id, name, owner, description, cover_image
- Playlist tracks: ordered list of (track_id, added_by, added_at)
- Store as ordered array in database
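
The data model above maps naturally onto two record types. This is an in-memory sketch to show the shape of the data and the ordering operations; the method names are assumptions, and a real service would persist these records rather than hold them in a list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PlaylistTrack:
    track_id: str
    added_by: str
    added_at: datetime


@dataclass
class Playlist:
    id: str
    name: str
    owner: str
    description: str = ""
    cover_image: str = ""
    tracks: list = field(default_factory=list)  # ordered PlaylistTrack entries

    def add(self, track_id: str, user: str, position=None) -> None:
        """Append a track, or insert it at the given position."""
        entry = PlaylistTrack(track_id, user, datetime.now(timezone.utc))
        if position is None:
            self.tracks.append(entry)
        else:
            self.tracks.insert(position, entry)

    def move(self, from_index: int, to_index: int) -> None:
        """Reorder: move one entry to a new position."""
        self.tracks.insert(to_index, self.tracks.pop(from_index))

    def track_ids(self) -> list:
        return [t.track_id for t in self.tracks]


p = Playlist("p1", "Road trip", "alice")
for tid in ("a", "b", "c"):
    p.add(tid, "alice")
p.move(2, 0)
p.add("d", "bob", position=1)
print(p.track_ids())
```

An ordered array keeps reads simple, but every reorder rewrites the array, which is worth mentioning as a trade-off for very long playlists.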

Collaborative playlists:
- Operational Transform or CRDT for concurrent edits
- In practice, a simpler scheme is often enough: last-write-wins per edit, with version checks to detect conflicting writes
- Version history allows undo
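
A lighter-weight alternative to OT or CRDTs is optimistic concurrency: every edit names the version it was based on, and stale edits are rejected so the client can refetch and retry. The sketch below shows that swapped-in technique together with snapshot-based undo. It is an assumption for illustration, not a description of how Spotify resolves conflicts.

```python
class VersionConflict(Exception):
    """Raised when an edit was based on an outdated playlist version."""


class VersionedPlaylist:
    def __init__(self, track_ids=()):
        self.history = [list(track_ids)]  # history[v] is the snapshot at version v

    @property
    def version(self) -> int:
        return len(self.history) - 1

    @property
    def tracks(self) -> list:
        return list(self.history[-1])

    def apply(self, base_version: int, edit) -> int:
        """Apply edit(tracks) -> new tracks if the caller saw the latest version."""
        if base_version != self.version:
            raise VersionConflict(
                f"edit based on v{base_version}, current is v{self.version}")
        self.history.append(list(edit(self.tracks)))
        return self.version

    def undo(self) -> int:
        """Restore the previous snapshot as a new version.

        Undo is recorded like any other edit, so concurrent editors see the
        version change. Calling it twice in a row restores the undone state.
        """
        if len(self.history) >= 2:
            self.history.append(list(self.history[-2]))
        return self.version


p = VersionedPlaylist(["a"])
p.apply(0, lambda tracks: tracks + ["b"])      # succeeds, now at v1
try:
    p.apply(0, lambda tracks: tracks + ["c"])  # stale: based on v0
except VersionConflict as err:
    print("rejected:", err)
```

Keeping full snapshots is wasteful for large playlists; storing per-edit deltas is the usual refinement.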

Popular playlists (millions of followers):
- Cache aggressively (read-heavy)
- Update propagation via pub/sub when playlist changes
- Eventual consistency is fine (30-second delay acceptable)

Key Takeaways for the Interview

Audio ≠ video: Different streaming challenges — gapless playback, smaller files, simpler bitrate adaptation
CDN is critical: 95%+ cache hit rate for audio. Pre-encode at all quality levels
Three recommendation approaches: Collaborative filtering + content-based + NLP, combined
Client caching saves bandwidth: Local LRU cache reduces CDN load by ~30%
Search is multi-signal: Text relevance + popularity + personalization + region
