System Design Flashcards
Actual Designs
Ticketmaster requirements
Functional
Users should be able to view events
Users should be able to search for events
Users should be able to book tickets to events
Below the line (out of scope):
Users should be able to view their booked events
Admins or event coordinators should be able to add events
Popular events should have dynamic pricing
NFRs
The system should prioritize availability for searching & viewing events, but should prioritize consistency for booking events (no double booking)
The system should be scalable and able to handle high throughput during popular events (10 million users, one event)
The system should have low latency search (< 500ms)
The system is read heavy, and thus needs to be able to support high read throughput (100:1 read-to-write ratio)
Below the line (out of scope):
The system should protect user data and adhere to GDPR
The system should be fault tolerant
The system should provide secure transactions for purchases
The system should be well tested and easy to deploy (CI/CD pipelines)
The system should have regular backups
Ticketmaster final design
Deep Dives
How do we improve the booking experience by locking tickets?
How is the view API going to scale to support tens of millions of concurrent requests during popular events?
How will the system ensure a good user experience during high-demand events with millions of users booking tickets simultaneously?
How can you improve search to ensure we meet our low latency requirements?
How can you speed up frequently repeated search queries and reduce load on our search infrastructure?
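One possible answer to the ticket-locking question is a reservation lock with a time-to-live (TTL): a ticket is held for one user while they check out and is released automatically if they abandon the purchase. The sketch below is a minimal in-memory illustration of that idea, not the design behind these cards. The `TicketLocks` class, the 10-minute default TTL, and the injectable `clock` are all assumptions made for the example; a real system would keep the locks in a shared store that supports key expiry.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TicketLocks:
    """Hypothetical in-memory stand-in for a shared lock store with expiry."""

    def __init__(self, ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds  # assumed hold time, not from the cards
        self.clock = clock
        # ticket_id -> (user_id, expiry time)
        self._locks: Dict[str, Tuple[str, float]] = {}

    def holder(self, ticket_id: str) -> Optional[str]:
        """Return the user holding the ticket, or None if it is free."""
        entry = self._locks.get(ticket_id)
        if entry is None:
            return None
        user_id, expires_at = entry
        if self.clock() >= expires_at:
            # The lock expired, so the ticket is available again.
            del self._locks[ticket_id]
            return None
        return user_id

    def reserve(self, ticket_id: str, user_id: str) -> bool:
        """Lock a ticket for user_id; fail if another user holds it."""
        current = self.holder(ticket_id)
        if current is not None and current != user_id:
            return False
        # A repeat call by the same user refreshes the expiry.
        self._locks[ticket_id] = (user_id, self.clock() + self.ttl_seconds)
        return True
```

With a fake clock, a second user is rejected while the lock is live and succeeds once the TTL has passed, which is the "no double booking" behavior the NFRs ask for.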
Uber requirements
Functional
Riders should be able to input a start location and a destination and get a fare estimate.
Riders should be able to request a ride based on the estimated fare.
Upon request, riders should be matched with a driver who is nearby and available.
Drivers should be able to accept/decline a request and navigate to pickup/drop-off.
Below the line (out of scope):
Riders should be able to rate their ride and driver post-trip.
Drivers should be able to rate passengers.
Riders should be able to schedule rides in advance.
Riders should be able to request different categories of rides (e.g., X, XL, Comfort).
NFRs
The system should prioritize low latency matching (< 1 minute to match or fail)
The system should ensure strong consistency in ride matching to prevent any driver from being assigned multiple rides simultaneously
The system should be able to handle high throughput, especially during peak hours or special events (100k requests from same location)
Below the line (out of scope):
The system should ensure the security and privacy of user and driver data, complying with regulations like GDPR.
The system should be resilient to failures, with redundancy and failover mechanisms in place.
The system should have robust monitoring, logging, and alerting to quickly identify and resolve issues.
The system should facilitate easy updates and maintenance without significant downtime (CI/CD pipelines).
Uber final design
Deep Dives
How do we handle frequent driver location updates and efficient proximity searches on location data?
How can we manage system overload from frequent driver location updates while ensuring location accuracy?
How do we prevent multiple ride requests from being sent to the same driver simultaneously?
How can we ensure no ride requests are dropped during peak demand periods?
How can you further scale the system to reduce latency and improve throughput?
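For the proximity-search question, one common approach is to bucket driver locations into grid cells so that a lookup only scans the rider's cell and its neighbors instead of every driver. The sketch below illustrates that idea under stated assumptions: `DriverGrid`, `cell_of`, and the 0.01-degree cell size are hypothetical, and distance is plain Euclidean distance in degrees, which is only a rough approximation over short ranges. Production systems typically use a geospatial index (geohash or quadtree based) rather than this toy grid.

```python
import math
from collections import defaultdict
from typing import Dict, List, Set, Tuple

CELL_DEG = 0.01  # assumed cell size in degrees (about 1 km of latitude)


def cell_of(lat: float, lon: float) -> Tuple[int, int]:
    """Map a coordinate to its integer grid cell."""
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


class DriverGrid:
    """Hypothetical in-memory grid index of driver locations."""

    def __init__(self) -> None:
        self._cells: Dict[Tuple[int, int], Set[str]] = defaultdict(set)
        self._positions: Dict[str, Tuple[float, float]] = {}

    def update(self, driver_id: str, lat: float, lon: float) -> None:
        """Record a location update, moving the driver between cells."""
        old = self._positions.get(driver_id)
        if old is not None:
            self._cells[cell_of(*old)].discard(driver_id)
        self._positions[driver_id] = (lat, lon)
        self._cells[cell_of(lat, lon)].add(driver_id)

    def nearby(self, lat: float, lon: float) -> List[str]:
        """Drivers in the rider's cell and the 8 surrounding cells, nearest first."""
        cx, cy = cell_of(lat, lon)
        found: List[str] = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                found.extend(self._cells.get((cx + dx, cy + dy), ()))
        # Euclidean distance in degrees: a simplification for the sketch.
        return sorted(found, key=lambda d: math.dist(self._positions[d], (lat, lon)))
```

Each location update touches at most two cells, so frequent updates stay cheap, and a lookup inspects nine cells regardless of the total number of drivers.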
Dropbox requirements
Functional
Users should be able to upload a file from any device
Users should be able to download a file from any device
Users should be able to share a file with other users and view the files shared with them
Below the line (out of scope):
Users should be able to edit files
Users should be able to view files without downloading them
NFRs
The system should be highly available (prioritizing availability over consistency).
The system should support files as large as 50GB.
The system should be secure and reliable. We should be able to recover files if they are lost or corrupted.
The system should make upload and download times as fast as possible (low latency).
Below the line (out of scope):
The system should have a storage limit per user
The system should support file versioning
The system should scan files for viruses and malware
Dropbox final design
How can you support large files?
How can we make uploads and downloads as fast as possible?
How can you ensure file security?
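A frequently given answer to the large-file question is to split the file into chunks, fingerprint each chunk, and upload only the chunks the server has not yet confirmed, which also makes uploads resumable. The sketch below shows that bookkeeping only. The function names and the 5 MiB chunk size are assumptions for illustration, and the file is held in memory as `bytes` for brevity, whereas a real client would stream a 50GB file from disk.

```python
import hashlib
from typing import List, Set

CHUNK_SIZE = 5 * 1024 * 1024  # assumed chunk size (5 MiB), not from the cards


def split_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split data into fixed-size chunks; the last chunk may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def fingerprint(chunk: bytes) -> str:
    """Content hash used to identify a chunk."""
    return hashlib.sha256(chunk).hexdigest()


def chunks_to_upload(data: bytes, already_uploaded: Set[str],
                     chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Indexes of chunks whose fingerprints the server has not confirmed."""
    return [
        index
        for index, chunk in enumerate(split_chunks(data, chunk_size))
        if fingerprint(chunk) not in already_uploaded
    ]
```

For example, if an upload was interrupted after the first chunk, calling `chunks_to_upload` with that chunk's fingerprint in `already_uploaded` returns only the remaining indexes.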
Facebook News Feed requirements
Functional
Users should be able to create posts.
Users should be able to friend/follow people.
Users should be able to view a feed of posts from people they follow, in chronological order.
Users should be able to page through their feed.
Below the line (out of scope):
Users should be able to like and comment on posts.
Posts can be private or have restricted visibility.
NFRs
The system should be highly available (prioritizing availability over consistency). Tolerate up to 2 minutes for eventual consistency.
Posting and viewing the feed should be fast, returning in < 500ms.
The system should be able to handle a massive number of users (2B).
Users should be able to follow, and be followed by, an unlimited number of users.
Facebook News Feed final design
How do we handle users who are following a large number of users?
How do we handle users with a large number of followers?
How can we handle uneven reads of Posts?
Design just UX & APIs for Facebook News Feed?
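The chronological-feed and paging requirements can be illustrated with a small fan-out-on-read sketch: merge each followed user's newest-first posts and return one page plus a cursor for the next page. This is only one of several possible designs. The `Post` type, the `feed_page` function, and the timestamp cursor are assumptions made for the example, and the cursor presumes unique timestamps; a real API would likely use a composite cursor such as timestamp plus post ID.

```python
import heapq
from itertools import islice
from typing import Dict, List, NamedTuple, Optional, Tuple


class Post(NamedTuple):
    created_at: int  # timestamp; assumed unique for this sketch
    post_id: str
    author: str


def feed_page(posts_by_author: Dict[str, List[Post]], following: List[str],
              page_size: int, cursor: Optional[int] = None
              ) -> Tuple[List[Post], Optional[int]]:
    """Return newest-first posts older than cursor, plus the next cursor.

    Each author's list is assumed to be sorted newest first already.
    """
    streams = [
        (p for p in posts_by_author.get(author, [])
         if cursor is None or p.created_at < cursor)
        for author in following
    ]
    # Lazily merge the per-author streams in descending timestamp order.
    merged = heapq.merge(*streams, key=lambda p: p.created_at, reverse=True)
    page = list(islice(merged, page_size))
    # A full page may have more behind it; a short page is the last one.
    next_cursor = page[-1].created_at if len(page) == page_size else None
    return page, next_cursor
```

Merging at read time gets slow for users who follow many accounts, which is the tension the deep-dive questions above are probing.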
Leetcode requirements
Functional
Users should be able to view a list of coding problems.
Users should be able to view a given problem and code a solution in multiple languages.
Users should be able to submit their solution and get instant feedback.
Users should be able to view a live leaderboard for competitions.
Below the line (out of scope):
User authentication
User profiles
Payment processing
User analytics
Social features
NFRs
The system should prioritize availability over consistency.
The system should support isolation and security when running user code.
The system should return submission results within 5 seconds.
The system should scale to support competitions with 100,000 users.
Below the line (out of scope):
The system should be fault-tolerant.
The system should provide secure transactions for purchases.
The system should be well-tested and easy to deploy (CI/CD pipelines).
The system should have regular backups.
Leetcode final design
How will the system support isolation and security when running user code?
How would you make fetching the leaderboard more efficient?
How would the system scale to support competitions with 100,000 users?
How would the system handle running test cases?
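For the leaderboard question, a common answer is to keep one best score per user in a sorted structure (for example a sorted set in a cache) so that the top N can be read without scanning all submissions. The sketch below is an in-memory stand-in for that idea. The `Leaderboard` class and the tie-break rule (earlier submission ranks higher) are assumptions for illustration, not something the cards specify.

```python
import heapq
from typing import Dict, List, Tuple


class Leaderboard:
    """Hypothetical in-memory stand-in for a sorted-set leaderboard."""

    def __init__(self) -> None:
        # user_id -> (best score, time that score was submitted)
        self._best: Dict[str, Tuple[int, int]] = {}

    def submit(self, user_id: str, score: int, submitted_at: int) -> None:
        """Keep only each user's highest score."""
        current = self._best.get(user_id)
        if current is None or score > current[0]:
            self._best[user_id] = (score, submitted_at)

    def top(self, n: int) -> List[Tuple[str, int]]:
        """Top n users by score; ties go to the earlier submission (assumed rule)."""
        ranked = heapq.nsmallest(
            n, self._best.items(), key=lambda kv: (-kv[1][0], kv[1][1])
        )
        return [(user, score) for user, (score, _) in ranked]
```

Because only one entry per user is stored, the structure grows with the number of competitors (100,000 in the NFRs) rather than with the number of submissions.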
Ad Click Aggregator requirements
Functional
Users can click on an ad and be redirected to the advertiser’s website
Advertisers can query ad click metrics over time with a minimum granularity of 1 minute
Below the line (out of scope):
Ad targeting
Ad serving
Cross device tracking
Integration with offline marketing channels
NFRs
Scalable to support a peak of 10k clicks per second
Low latency analytics queries for advertisers (sub-second response time)
Fault tolerant and accurate data collection. We should not lose any click data.
As realtime as possible. Advertisers should be able to query data as soon as possible after the click.
Idempotent click tracking. We should not count the same click multiple times.
Below the line (out of scope):
Fraud or spam detection
Demographic and geo profiling of users
Conversion tracking
Ad Click Aggregator final design
How can we scale to support 10k clicks per second?
How can we ensure that we don’t lose any click data?
How can we prevent abuse from users clicking on ads multiple times?
How can we ensure that advertisers can query metrics at low latency?
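The idempotency and 1-minute granularity requirements can be made concrete with a small sketch: deduplicate clicks by a unique ID, then count them into per-minute buckets that are cheap to query. This is an illustration under assumptions, not the design behind these cards. `ClickAggregator` and the `click_id` parameter are hypothetical, and the in-memory set and counter stand in for the stream processor and analytics store a real pipeline would use.

```python
from collections import Counter
from typing import Set


class ClickAggregator:
    """Hypothetical in-memory dedup plus per-minute click counts."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()       # click IDs already counted
        self._counts: Counter = Counter()  # (ad_id, minute) -> clicks

    def record(self, click_id: str, ad_id: str, ts_seconds: int) -> bool:
        """Count a click once; return False for a duplicate click_id."""
        if click_id in self._seen:
            return False
        self._seen.add(click_id)
        self._counts[(ad_id, ts_seconds // 60)] += 1
        return True

    def clicks(self, ad_id: str, start_minute: int, end_minute: int) -> int:
        """Total clicks for ad_id over minutes [start_minute, end_minute)."""
        return sum(
            self._counts[(ad_id, minute)]
            for minute in range(start_minute, end_minute)
        )
```

Pre-aggregating by minute means a query sums a handful of counters instead of scanning raw click events, which is what keeps advertiser queries fast.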
Top k service requirements
Functional
Clients should be able to query the top K videos (max 1000) for a given time period.
Time periods should be limited to 1 {hour, day, month} and all-time.
Below the line (out of scope):
Arbitrary time periods.
Arbitrary starting/ending points (we’ll assume all queries are looking back from the current moment).
NFRs
We’ll tolerate at most 1 min delay between when a view occurs and when it should be tabulated.
Our results must be precise, so we should not approximate. (Note: This would be unusual for most production systems)
Our system should be able to handle a massive number (TBD - cover this later) of views per second.
We should support a massive number (TBD - cover this later) of videos.
We should return results within tens of milliseconds.
Our system should be economical. We shouldn’t need a 10k host fleet to solve this problem.
Top k service final design
Handling Time Windows
Large number of incoming requests
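To make the time-window requirement concrete, the sketch below keeps exact per-minute view counts and answers a top-K query by summing the buckets in the window, looking back from the current moment. It is a naive illustration of precise (non-approximate) counting, not the final design. `ViewCounts` and its method names are assumptions, and summing every bucket at query time would not meet the latency target at scale; an all-time window would also need a separate running total.

```python
import heapq
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


class ViewCounts:
    """Hypothetical exact per-minute view counters."""

    def __init__(self) -> None:
        # minute -> Counter of video_id -> views in that minute
        self._by_minute: Dict[int, Counter] = defaultdict(Counter)

    def record_view(self, video_id: str, ts_seconds: int) -> None:
        self._by_minute[ts_seconds // 60][video_id] += 1

    def top_k(self, k: int, now_seconds: int,
              window_minutes: int) -> List[Tuple[str, int]]:
        """Exact top k videos over the last window_minutes, ending at now."""
        current = now_seconds // 60
        totals: Counter = Counter()
        for minute in range(current - window_minutes + 1, current + 1):
            if minute in self._by_minute:
                totals.update(self._by_minute[minute])
        # Highest view count first.
        return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])
```

The same buckets serve the hour, day, and month windows by changing `window_minutes`, at the cost of more buckets to sum for longer windows.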