Mini Design Breakdowns: Easy Flashcards
How do you design a scalable URL Shortener (Bit.ly clone)?
Short-Code Generation → Hash the long URL (checking for collisions) or Base62-encode a unique ID into a 6-character short URL
Database for Storage → Use a key-value store (e.g., Redis) or a sharded relational database (e.g., MySQL)
Load Balancer → Distribute traffic across multiple servers
Cache Layer → Use Redis to cache frequently requested URLs
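One common way to get a unique 6-character code is to Base62-encode a unique numeric ID instead of hashing the URL. The following is a minimal sketch, assuming IDs come from some counter or ID generator that is not shown; `encodeBase62` and `decodeBase62` are illustrative names, not part of any library.

```javascript
// Alphabet for Base62: digits, lowercase, uppercase (62 symbols).
const BASE62_ALPHABET =
  "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Encode a non-negative integer ID as a Base62 string, left-padded to `length`.
function encodeBase62(id, length = 6) {
  if (!Number.isSafeInteger(id) || id < 0) {
    throw new RangeError("id must be a non-negative safe integer");
  }
  let n = id;
  let code = "";
  do {
    code = BASE62_ALPHABET[n % 62] + code;
    n = Math.floor(n / 62);
  } while (n > 0);
  return code.padStart(length, BASE62_ALPHABET[0]);
}

// Decode a Base62 short code back to the numeric ID used as the storage key.
function decodeBase62(code) {
  let id = 0;
  for (const ch of code) {
    const digit = BASE62_ALPHABET.indexOf(ch);
    if (digit < 0) throw new RangeError(`invalid character: ${ch}`);
    id = id * 62 + digit;
  }
  return id;
}
```

Six Base62 characters cover 62^6 (about 56.8 billion) distinct IDs, so the code length only needs to grow once the ID space passes that point.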
How do you design a Scalable Notification System?
Message Queue (Kafka, SQS, RabbitMQ) → Decouple notification delivery
Push Notifications & WebSockets → Real-time user alerts
Database for Persistence → Store notifications in NoSQL (e.g., DynamoDB, MongoDB)
Priority Queue for Critical Alerts → Ensure urgent messages are sent first
User Preferences Management → Allow users to control notification settings
Deep Dives for Senior+
How do you ensure exactly-once delivery? → Use idempotency keys with deduplication
How to handle multi-channel notifications? → Unified Notification Gateway (email, SMS, push)
How do you scale real-time notifications? → Partition WebSocket connections
How do you deal with noisy notifications? → Implement batching and frequency caps
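The idempotency-key idea can be sketched as follows. This is an illustration under stated assumptions: the `NotificationSender` class and the `idempotencyKey` field are hypothetical, and the in-memory `Set` stands in for a shared store (such as Redis) that a real multi-server deployment would need.

```javascript
// Deduplicating sender: a notification with an already-seen key is skipped,
// so retries from the queue do not reach the user twice.
class NotificationSender {
  constructor(sendFn) {
    this.sendFn = sendFn;   // hypothetical channel-specific send function
    this.seen = new Set();  // in-memory stand-in for a shared dedup store
  }

  // Returns true if the notification was sent, false if it was a duplicate.
  deliver(notification) {
    const key = notification.idempotencyKey;
    if (this.seen.has(key)) return false;
    this.sendFn(notification);
    // Record the key only after a successful send so failed sends can be retried.
    this.seen.add(key);
    return true;
  }
}
```

Combined with at-least-once delivery from the queue, this gives effectively-once behavior from the user's point of view, provided the key store outlives the retry window.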
How do you design a Real-Time Chat Application?
WebSockets for Low-Latency Communication → Maintain persistent connections
Message Queue (Kafka, SQS) for Reliability → Ensure message delivery
Database (Cassandra, DynamoDB) for Persistence → Store chat history
Eventual Consistency for Message Syncing → Handle offline messages
Presence System (Redis, Zookeeper) → Track user online/offline status
Deep Dives for Senior+
How do you handle offline message delivery? → Use a message queue for delayed delivery
How to handle high fanout (e.g., group chats)? → Use multicast strategies
How do you prevent message loss? → Implement at-least-once or exactly-once semantics
How to handle read receipts efficiently? → Use a separate microservice for metadata tracking
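Offline delivery through a queue can be illustrated with a small in-memory router. This is a sketch only: `ChatRouter` and its methods are hypothetical names, and the per-user array stands in for a durable queue or database table.

```javascript
// Routes messages to connected users and queues them for offline users.
class ChatRouter {
  constructor() {
    this.online = new Map();  // userId -> delivery callback (e.g., a WebSocket send)
    this.pending = new Map(); // userId -> messages waiting for reconnect
  }

  // On connect, register the callback and flush queued messages in arrival order.
  connect(userId, deliver) {
    this.online.set(userId, deliver);
    const queued = this.pending.get(userId) || [];
    this.pending.delete(userId);
    for (const msg of queued) deliver(msg);
  }

  disconnect(userId) {
    this.online.delete(userId);
  }

  // Deliver immediately if the recipient is online, otherwise queue.
  send(toUserId, msg) {
    const deliver = this.online.get(toUserId);
    if (deliver) {
      deliver(msg);
      return "delivered";
    }
    if (!this.pending.has(toUserId)) this.pending.set(toUserId, []);
    this.pending.get(toUserId).push(msg);
    return "queued";
  }
}
```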
How do you design a Search Autocomplete System?
Trie Data Structure for Fast Prefix Search
Ranking Algorithm (TF-IDF, PageRank) → Prioritize search results
Elasticsearch or Solr for Full-Text Search
Sharding & Index Partitioning for Scalability
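A trie-based prefix lookup can be sketched as below. The class and method names (`AutocompleteTrie`, `record`, `suggest`) are assumptions for illustration, and ranking here is by raw query count rather than TF-IDF or PageRank.

```javascript
class TrieNode {
  constructor() {
    this.children = new Map(); // character -> TrieNode
    this.count = 0;            // times a query ending at this node was recorded
  }
}

class AutocompleteTrie {
  constructor() {
    this.root = new TrieNode();
  }

  // Record one occurrence of a full query string.
  record(query) {
    let node = this.root;
    for (const ch of query) {
      if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
      node = node.children.get(ch);
    }
    node.count += 1;
  }

  // Return up to k recorded queries starting with `prefix`, most frequent first.
  suggest(prefix, k = 3) {
    let node = this.root;
    for (const ch of prefix) {
      node = node.children.get(ch);
      if (!node) return [];
    }
    const matches = [];
    const stack = [[node, prefix]];
    while (stack.length > 0) {
      const [current, text] = stack.pop();
      if (current.count > 0) matches.push({ text, count: current.count });
      for (const [ch, child] of current.children) stack.push([child, text + ch]);
    }
    // Higher count first; ties broken alphabetically for stable output.
    matches.sort((a, b) => b.count - a.count || (a.text < b.text ? -1 : 1));
    return matches.slice(0, k).map((m) => m.text);
  }
}
```

Production systems often cache the top-k completions at each node so a lookup does not have to walk the whole subtree.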
Deep Dives
How do you update indexes in real-time? → Use incremental indexing
How to optimize prefix search performance? → Bloom filters and tries
How do you handle typos (fuzzy matching)? → Levenshtein distance
How do you make search ranking customizable per user? → Personalized search ranking
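For the fuzzy-matching card, Levenshtein distance can be computed with the standard dynamic-programming recurrence. The sketch below keeps only two rows of the table; `fuzzyMatch` and its `maxDistance` threshold are illustrative assumptions.

```javascript
// Levenshtein (edit) distance between strings a and b.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return candidate terms within maxDistance edits of the input, closest first.
function fuzzyMatch(input, candidates, maxDistance = 2) {
  return candidates
    .map((term) => ({ term, distance: levenshtein(input, term) }))
    .filter((m) => m.distance <= maxDistance)
    .sort((x, y) => x.distance - y.distance)
    .map((m) => m.term);
}
```

Comparing the input against every term is too slow for a large dictionary, so this is usually applied to a short candidate list produced by the index.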
How do you design Facebook News Feed?
Feed Ranking Algorithm → Personalize content based on engagement signals
Graph-based Recommendation System → Surface posts from relevant connections
Distributed Storage for Content → Store and retrieve posts efficiently
Push vs. Pull Model → Optimize for performance based on active/inactive users
Edge Ranking Pipeline → Rank content using ML models
Incremental Updates vs. Full Recompute → Ensure real-time freshness
Deep Dives
How do you handle personalized ranking at scale? → Feature engineering, embeddings
How do you reduce load when computing feeds? → Fanout-on-read vs. Fanout-on-write
How do you handle abusive/spam content at scale? → AI/ML for content moderation
How do you reduce latency for high-profile users? → Prioritized caching for influencers
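function rankFeed(posts) {
  // Engagement score: comments are weighted twice as heavily as likes.
  const score = (post) => post.likes + post.comments * 2;
  // Sort a copy, highest score first, so the caller's array is not mutated.
  return [...posts].sort((a, b) => score(b) - score(a));
}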
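The fanout-on-write vs. fanout-on-read trade-off is often handled with a hybrid: push posts to followers' inboxes for ordinary authors, and pull posts at read time for authors with very large followings. The sketch below assumes a hypothetical `FeedService` class, in-memory maps in place of real storage, and an arbitrary `celebrityThreshold`.

```javascript
class FeedService {
  constructor(celebrityThreshold = 1000) {
    this.celebrityThreshold = celebrityThreshold;
    this.followers = new Map(); // authorId -> Set of follower ids
    this.following = new Map(); // userId -> Set of author ids
    this.inbox = new Map();     // userId -> posts pushed at write time
    this.outbox = new Map();    // authorId -> the author's own posts
  }

  follow(userId, authorId) {
    if (!this.followers.has(authorId)) this.followers.set(authorId, new Set());
    if (!this.following.has(userId)) this.following.set(userId, new Set());
    this.followers.get(authorId).add(userId);
    this.following.get(userId).add(authorId);
  }

  isCelebrity(authorId) {
    return (this.followers.get(authorId)?.size ?? 0) >= this.celebrityThreshold;
  }

  publish(authorId, post) {
    if (!this.outbox.has(authorId)) this.outbox.set(authorId, []);
    this.outbox.get(authorId).push(post);
    if (this.isCelebrity(authorId)) return; // followers pull these at read time
    // Fanout-on-write: copy the post into each follower's inbox.
    for (const followerId of this.followers.get(authorId) || []) {
      if (!this.inbox.has(followerId)) this.inbox.set(followerId, []);
      this.inbox.get(followerId).push(post);
    }
  }

  getFeed(userId) {
    // Deduplicate by post id in case an author crossed the threshold mid-stream.
    const byId = new Map();
    for (const post of this.inbox.get(userId) || []) byId.set(post.id, post);
    // Fanout-on-read: merge in posts from followed celebrity authors.
    for (const authorId of this.following.get(userId) || []) {
      if (!this.isCelebrity(authorId)) continue;
      for (const post of this.outbox.get(authorId) || []) byId.set(post.id, post);
    }
    return [...byId.values()].sort((a, b) => b.createdAt - a.createdAt);
  }
}
```

Writes stay cheap for authors with huge audiences, at the cost of extra merge work on each read by their followers.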
How do you design Facebook Live Streaming?
Live Video Encoding & Transcoding → Ensure multiple resolutions
Low-Latency Streaming Protocols → WebRTC, Low-Latency HLS (LL-HLS), RTMP for ingest
Commenting & Reactions in Real-Time → WebSockets for chat
CDN for Global Distribution → Cache and serve live streams
Scalability for Viral Streams → Auto-scaling servers
Deep Dives
How do you optimize for low-latency streaming? → Adaptive bitrate streaming
How do you prevent abuse in live chats? → AI-based moderation
How do you handle sudden spikes in viewership? → Dynamic auto-scaling
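The core decision in adaptive bitrate streaming is which rendition to request next, given the measured throughput. A minimal sketch follows; the rendition ladder, the bitrates, and the `safetyFactor` are illustrative assumptions, not values from any particular player.

```javascript
// Hypothetical rendition ladder produced by the transcoder, lowest bitrate first.
const RENDITIONS = [
  { name: "240p", kbps: 400 },
  { name: "480p", kbps: 1200 },
  { name: "720p", kbps: 2800 },
  { name: "1080p", kbps: 5000 },
];

// Pick the highest rendition whose bitrate fits within a fraction of the
// measured bandwidth; fall back to the lowest rendition if none fits.
function pickRendition(measuredKbps, safetyFactor = 0.8) {
  const budget = measuredKbps * safetyFactor;
  let choice = RENDITIONS[0];
  for (const rendition of RENDITIONS) {
    if (rendition.kbps <= budget) choice = rendition;
  }
  return choice;
}
```

Real players also take buffer level and recent switch history into account to avoid oscillating between renditions.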
Design Yelp
Geospatial Search Index → Search businesses by location
Review & Rating System → Store user reviews and compute ratings
Ranking Algorithm → Sort businesses based on relevance and distance
Spam & Fake Review Detection → NLP-based filtering
Deep Dives
How do you optimize location-based search? → Use QuadTrees or KD-Trees
How do you prevent fake reviews? → Sentiment analysis, user reputation scoring
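A point quadtree for location search can be sketched as follows. It assumes planar x/y coordinates and rectangles described by a centre plus half-width and half-height; mapping latitude and longitude onto that plane, and limiting tree depth when many points share one location, are left out. The `QuadTree` class is illustrative, not a library API.

```javascript
class QuadTree {
  // boundary: { x, y, w, h } = centre point plus half-width and half-height.
  constructor(boundary, capacity = 4) {
    this.boundary = boundary;
    this.capacity = capacity;
    this.points = [];
    this.children = null;
  }

  // Half-open containment so each point belongs to exactly one child.
  static contains(b, p) {
    return p.x >= b.x - b.w && p.x < b.x + b.w &&
           p.y >= b.y - b.h && p.y < b.y + b.h;
  }

  static intersects(a, b) {
    return !(b.x - b.w >= a.x + a.w || b.x + b.w <= a.x - a.w ||
             b.y - b.h >= a.y + a.h || b.y + b.h <= a.y - a.h);
  }

  subdivide() {
    const { x, y, w, h } = this.boundary;
    const hw = w / 2, hh = h / 2;
    this.children = [
      new QuadTree({ x: x - hw, y: y - hh, w: hw, h: hh }, this.capacity),
      new QuadTree({ x: x + hw, y: y - hh, w: hw, h: hh }, this.capacity),
      new QuadTree({ x: x - hw, y: y + hh, w: hw, h: hh }, this.capacity),
      new QuadTree({ x: x + hw, y: y + hh, w: hw, h: hh }, this.capacity),
    ];
  }

  // Insert a point { x, y, ... }; returns false if it lies outside this node.
  insert(p) {
    if (!QuadTree.contains(this.boundary, p)) return false;
    if (this.children === null) {
      if (this.points.length < this.capacity) {
        this.points.push(p);
        return true;
      }
      this.subdivide();
    }
    return this.children.some((child) => child.insert(p));
  }

  // Collect all points inside the rectangular range.
  query(range, found = []) {
    if (!QuadTree.intersects(this.boundary, range)) return found;
    for (const p of this.points) {
      if (QuadTree.contains(range, p)) found.push(p);
    }
    if (this.children) {
      for (const child of this.children) child.query(range, found);
    }
    return found;
  }
}
```

A query skips every subtree whose boundary does not intersect the search rectangle, which is what keeps lookups fast in sparse regions.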
Design Leetcode (Coding Platform)
Question Storage & Tagging → Categorize problems by difficulty and topic
Online Code Execution Engine → Run user-submitted code in a sandbox
Leaderboard & Contest System → Store rankings and performance metrics
Deep Dives
How do you execute user code securely? → Use isolated Docker containers
How do you prevent DDoS on coding servers? → Implement rate-limiting
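Rate limiting is commonly implemented with a token bucket. The sketch below is one such limiter for a single client; the `TokenBucket` class is an illustrative name, and time is passed in explicitly (in milliseconds) to keep the example deterministic.

```javascript
// Token bucket: allows bursts up to `capacity`, then a steady `refillPerSecond`.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full
    this.lastRefill = now;
  }

  // Returns true if the request is allowed, false if it should be rejected.
  tryConsume(now = Date.now()) {
    // Refill in proportion to the time elapsed since the last call.
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

In a multi-server deployment the bucket state would live in a shared store keyed by user or IP address rather than in process memory.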
Design Strava (Fitness Tracking)
GPS Data Collection → Store user workout routes
Performance Metrics Computation → Calculate pace, speed, elevation
Leaderboards & Challenges → Track user performance over time
Deep Dives
How do you handle GPS data outliers? → Use Kalman filters to smooth data
How do you store time-series fitness data efficiently? → Use a time-series database (TSDB) such as InfluxDB
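Kalman-filter smoothing can be illustrated with a scalar filter applied to one noisy series, such as elevation readings. This sketch assumes a constant-state model; a filter for a moving GPS track would usually track position and velocity together. The noise parameters are illustrative defaults.

```javascript
// One-dimensional Kalman filter with a constant-state model.
// processNoise: how much the true value is expected to drift between samples.
// measurementNoise: how noisy each reading is assumed to be.
function kalmanSmooth(measurements, processNoise = 1e-3, measurementNoise = 1) {
  if (measurements.length === 0) return [];
  let estimate = measurements[0];
  let errorCovariance = 1;
  const smoothed = [estimate];
  for (let i = 1; i < measurements.length; i++) {
    // Predict: the state is assumed unchanged, but uncertainty grows.
    errorCovariance += processNoise;
    // Update: blend prediction and measurement according to the Kalman gain.
    const gain = errorCovariance / (errorCovariance + measurementNoise);
    estimate += gain * (measurements[i] - estimate);
    errorCovariance *= 1 - gain;
    smoothed.push(estimate);
  }
  return smoothed;
}
```

A single outlier pulls the estimate only part of the way toward the bad reading, and raising `measurementNoise` makes the filter trust each reading less.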
Design Dropbox (File Storage & Syncing)
Chunked File Storage → Store large files in chunks
Versioning & Conflict Resolution → Handle edits from multiple devices
File Syncing Algorithm → Detect file changes efficiently
Deep Dives
How do you optimize large file uploads? → Use resumable uploads with checksums
How do you handle file conflicts in shared folders? → Resolve with a last-writer-wins strategy, or keep both versions as a conflicted copy
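Chunking plus per-chunk checksums is what makes an upload resumable: the client re-sends only the chunks the server does not already hold intact. The sketch below uses a 32-bit FNV-1a hash as a dependency-free stand-in for a real checksum such as SHA-256; the function names and the `serverChecksums` map are assumptions for illustration.

```javascript
// FNV-1a 32-bit hash over a byte array, returned as 8 hex characters.
// Stand-in only: a production system would use a cryptographic hash.
function fnv1a(bytes) {
  let hash = 0x811c9dc5;
  for (const byte of bytes) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Split a Uint8Array into fixed-size chunks, each tagged with its checksum.
function splitIntoChunks(bytes, chunkSize) {
  const chunks = [];
  for (let offset = 0; offset < bytes.length; offset += chunkSize) {
    const data = bytes.subarray(offset, offset + chunkSize);
    chunks.push({ index: chunks.length, checksum: fnv1a(data), data });
  }
  return chunks;
}

// serverChecksums: Map of chunk index -> checksum the server already stores.
// Returns the chunks that are missing or whose checksum does not match.
function chunksToUpload(chunks, serverChecksums) {
  return chunks.filter((c) => serverChecksums.get(c.index) !== c.checksum);
}
```

After an interrupted upload, the client asks the server for its checksum map and uploads only what `chunksToUpload` returns.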
Design Gopuff (On-Demand Delivery)
Order Management System → Track inventory and purchases
Delivery Routing Algorithm → Optimize driver routes for efficiency
Warehouse Network → Distribute items across multiple fulfillment centers
Deep Dives
How do you balance stock across warehouses? → Use predictive demand forecasting
How do you reduce delivery wait times? → Implement dynamic batching
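Dynamic batching can be sketched as grouping orders that arrive close together in time, up to a per-driver limit. This example considers only time and batch size; a real dispatcher would also weigh delivery location and driver availability. The `batchOrders` function and the `placedAt` field (milliseconds) are illustrative assumptions.

```javascript
// Group orders into batches. A batch closes when it holds maxSize orders or
// when the next order arrives more than windowMs after the batch's first order.
function batchOrders(orders, windowMs, maxSize) {
  const sorted = [...orders].sort((a, b) => a.placedAt - b.placedAt);
  const batches = [];
  let current = [];
  for (const order of sorted) {
    const expired =
      current.length > 0 && order.placedAt - current[0].placedAt > windowMs;
    if (expired || current.length >= maxSize) {
      batches.push(current);
      current = [];
    }
    current.push(order);
  }
  if (current.length > 0) batches.push(current);
  return batches;
}
```

A shorter window reduces how long the first order in a batch waits, while a longer window lets more orders share one trip.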