System Design Week 1 - URL Shortener Flashcards
What are the Functional Requirements?
- Long URL -> Generate a unique short URL
- Short URL -> Retrieve the long URL from the short URL
- Customized URLs: Users should be able to generate custom short links for their URLs using our system.
- TTL of URLs
What are the Non-Functional Requirements?
- Number of short URLs generated per second
- Number of URL retrievals per second
- The length of the short URL - let’s start with 7
- Character set in the short URL - A-Z,a-z,0-9
- Availability: Our system should be highly available
- Scalability: Our system should be horizontally scalable with increasing demand.
- Latency: The system should serve requests with low latency.
- Unpredictability: From a security standpoint, the short links generated by our system should be highly unpredictable.
List the microservices covering all functional requirements
URL Shortener Service
What is the Logical Diagram & flow of data?
https://drive.google.com/file/d/1yUQyWilq4dQbHQF2ScWDAX1HXDeNbIqr/view?usp=sharing
What is the Database Schema?
Table 1: id, longUrl, shortId, createTime, TTL, user_id
Table 2: id, username, mail_id, metainfo
Data stored per write request:
long URL: 400 bytes
short ID: 10 bytes
creation time: 10 bytes
TTL: 10 bytes
user ID: 10 bytes
Total = 440 bytes, rounded up to ~500 bytes (0.5 KB) per record
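The per-record arithmetic can be checked with a short Python sketch. The field sizes come from the estimate above; the write rate (`ASSUMED_WRITES_PER_SECOND`) is a made-up figure used only to show how the record size feeds a daily storage estimate.

```python
# Field sizes in bytes, taken from the per-record estimate above.
FIELD_BYTES = {
    "long_url": 400,
    "short_id": 10,
    "creation_time": 10,
    "ttl": 10,
    "user_id": 10,
}

# Hypothetical workload, NOT from the flashcards: chosen only to make the math concrete.
ASSUMED_WRITES_PER_SECOND = 100
SECONDS_PER_DAY = 24 * 60 * 60


def raw_record_bytes() -> int:
    """Sum of the individual field sizes, before rounding up."""
    return sum(FIELD_BYTES.values())


def daily_storage_bytes(record_bytes: int, writes_per_second: int) -> int:
    """Bytes written per day for a given record size and write rate."""
    return record_bytes * writes_per_second * SECONDS_PER_DAY


print(raw_record_bytes())                                   # 440
print(daily_storage_bytes(500, ASSUMED_WRITES_PER_SECOND))  # 4320000000
```

Under that assumed rate, the rounded 500-byte record works out to roughly 4.3 GB of new data per day.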
What is the API Design?
create(long URL)
read(short URL)
delete(short URL) // optional
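A minimal in-memory sketch of the three calls, in Python. The class name `UrlShortenerApi`, the injected `next_short_id` callable, and the dictionary store are illustrative assumptions, not part of the original design; a real service would sit behind HTTP and a database.

```python
import itertools
from typing import Callable, Dict, Optional


class UrlShortenerApi:
    """In-memory stand-in for create/read/delete; not a production service."""

    def __init__(self, next_short_id: Callable[[], str]) -> None:
        self._next_short_id = next_short_id  # hypothetical source of short IDs
        self._store: Dict[str, str] = {}     # short ID -> long URL

    def create(self, long_url: str) -> str:
        """Store the long URL and return its newly assigned short ID."""
        short_id = self._next_short_id()
        self._store[short_id] = long_url
        return short_id

    def read(self, short_id: str) -> Optional[str]:
        """Return the long URL, or None if the short ID is unknown."""
        return self._store.get(short_id)

    def delete(self, short_id: str) -> bool:
        """Remove the mapping; return True only if it existed."""
        return self._store.pop(short_id, None) is not None


# Usage with a placeholder ID source (a plain counter, for illustration only).
_counter = itertools.count(1)
api = UrlShortenerApi(lambda: f"id{next(_counter)}")
short = api.create("https://example.com/some/long/path")
print(short, api.read(short))
```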
What is the Business logic for the problem statement?
- Convert the unique ID to an n-character string (n = 7 here, calculated from the data estimation)
- Base-62 encoding and decoding
- Explain why 7 characters (as a footnote): 62^7 is about 3.5 trillion possible short URLs
https://drive.google.com/file/d/16DbXmDkgBC-A2ISsbQkWsUXkyCPcIV2B/view?usp=sharing
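The Base 62 conversion and the 7-character capacity can be sketched in Python as follows. The ordering of the alphabet (digits, then uppercase, then lowercase) and the zero-padding to a fixed width are assumptions made for this example; the flashcards only fix the character set and the length.

```python
import string

# Character set from the requirements: A-Z, a-z, 0-9. The ordering is an assumption.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)  # 62
SHORT_LEN = 7         # length chosen in the requirements


def encode(n: int, width: int = SHORT_LEN) -> str:
    """Encode a non-negative integer as a fixed-width Base 62 string."""
    if n < 0 or n >= BASE ** width:
        raise ValueError("ID does not fit in the requested width")
    chars = []
    for _ in range(width):
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode(s: str) -> int:
    """Decode a Base 62 string back to the integer it represents."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(BASE ** SHORT_LEN)  # 3521614606208 distinct 7-character strings
print(encode(62))         # 0000010
print(decode(encode(123456789)))  # 123456789
```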
Core logic to get a short URL:
Short URL generator: The short URL generator comprises a building block and an additional component:
A sequencer to generate unique IDs
A Base-58 encoder to enhance the readability of the short URL (Base-58 is the Base-62 alphabet without the look-alike characters 0, O, I, and l)
We built a sequencer in our building blocks section to generate 64-bit unique numeric IDs. However, our proposed design requires these 64-bit IDs to be represented as alphanumeric short URLs in base-58. To convert the numeric (base-10) IDs to alphanumeric (base-58), we need a base-10 to base-58 encoder.
Shortening: Each new request for short link computation gets forwarded to the short URL generator (SUG) by the application server. Upon successful generation of the short link, the system sends one copy back to the user and stores the record in the database for future use.
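The shortening flow can be sketched in Python as a sequencer feeding a Base-58 encoder, with the result stored before it is returned. The `Sequencer` here is a single-process counter standing in for the real 64-bit ID building block, the dictionary stands in for the database, and the alphabet is the commonly used Base-58 set; all three are assumptions for illustration.

```python
import itertools
from typing import Dict

# Commonly used Base-58 alphabet: Base-62 minus 0, O, I and l.
BASE58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


class Sequencer:
    """Single-process stand-in for the unique-ID sequencer (assumption)."""

    def __init__(self, start: int = 1) -> None:
        self._ids = itertools.count(start)

    def next_id(self) -> int:
        return next(self._ids)


def base58_encode(n: int) -> str:
    """Convert a non-negative base-10 integer to a Base-58 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return BASE58[0]
    out = []
    while n:
        n, rem = divmod(n, 58)
        out.append(BASE58[rem])
    return "".join(reversed(out))


class ShortUrlGenerator:
    """Hypothetical SUG: generates the short ID and stores the record."""

    def __init__(self, sequencer: Sequencer, database: Dict[str, str]) -> None:
        self._sequencer = sequencer
        self._database = database  # stand-in for the real datastore

    def shorten(self, long_url: str) -> str:
        short_id = base58_encode(self._sequencer.next_id())
        self._database[short_id] = long_url  # store for future lookups
        return short_id                      # copy sent back to the user


db: Dict[str, str] = {}
sug = ShortUrlGenerator(Sequencer(), db)
print(sug.shorten("https://example.com/a"))  # 2
print(len(base58_encode(2**64 - 1)))         # 11
```

Encoding sequential IDs directly produces guessable short links, so the unpredictability requirement would need additional work that this sketch does not attempt.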
What are the Design Considerations for the problem statement?
CAP theorem
AP system. Must explain why AP and not CP.
Scaling
Must discuss which of the following apply, and why:
Scale for storage
Scale for throughput
Scale for API parallelization
Need to remove hotspot
Availability and geo-distribution: the system must be highly available, but geo-distribution is not needed.
Sharding
Explain why sharding is (or is not) required here.
State whether vertical or horizontal sharding is required (here, horizontal sharding).
What will be the partition key, and is it mapped with a hash function or a range function?
State whether a fixed number of shards or dynamic shard servers are required (here, dynamic).
Consistent hashing must be mentioned when the number of shards is dynamic.
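To make the consistent hashing point concrete, here is a small Python sketch of a hash ring with virtual nodes. The class name, the use of SHA-256, the virtual-node count, and the shard names are all assumptions for this example; the property it illustrates is that adding a shard moves only the keys that land on the new shard.

```python
import bisect
import hashlib
from typing import Dict, List


class ConsistentHashRing:
    """Illustrative hash ring with virtual nodes; names are hypothetical."""

    def __init__(self, vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._ring: List[int] = []       # sorted virtual-node positions
        self._owner: Dict[int, str] = {}  # position -> shard name

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of SHA-256 as a 64-bit ring position.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add_shard(self, shard: str) -> None:
        for i in range(self._vnodes):
            pos = self._hash(f"{shard}#{i}")
            if pos in self._owner:  # skip the (very unlikely) collision
                continue
            self._owner[pos] = shard
            bisect.insort(self._ring, pos)

    def remove_shard(self, shard: str) -> None:
        self._ring = [p for p in self._ring if self._owner[p] != shard]
        self._owner = {p: s for p, s in self._owner.items() if s != shard}

    def shard_for(self, key: str) -> str:
        """Return the shard owning the first position clockwise from the key."""
        if not self._ring:
            raise LookupError("no shards registered")
        idx = bisect.bisect_right(self._ring, self._hash(key)) % len(self._ring)
        return self._owner[self._ring[idx]]


ring = ConsistentHashRing()
for name in ("shard-a", "shard-b", "shard-c"):
    ring.add_shard(name)
print(ring.shard_for("abc1234"))
```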
Replication
Required. Must explain the reason,
e.g. for availability as well as throughput.
Caching
Must explain clearly whether caching is required.
If caching is required, state which caching mechanism to use.
State the cache eviction policy.
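As one possible answer to the eviction question, the Python sketch below shows a least-recently-used (LRU) cache for short ID lookups. LRU is an assumption here, chosen as a common default; the flashcards do not name a policy, and `LruCache` is a hypothetical name.

```python
from collections import OrderedDict
from typing import Optional


class LruCache:
    """Tiny LRU cache for short ID -> long URL lookups (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, short_id: str) -> Optional[str]:
        """Return the cached long URL and mark it as most recently used."""
        if short_id not in self._items:
            return None
        self._items.move_to_end(short_id)
        return self._items[short_id]

    def put(self, short_id: str, long_url: str) -> None:
        """Insert or refresh an entry, evicting the least recently used one."""
        self._items[short_id] = long_url
        self._items.move_to_end(short_id)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the oldest entry


cache = LruCache(capacity=2)
cache.put("a", "https://example.com/a")
cache.put("b", "https://example.com/b")
cache.get("a")                           # "a" is now the most recently used
cache.put("c", "https://example.com/c")  # evicts "b"
print(cache.get("b"))                    # None
```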
API Parallelization
Must explain that API parallelization is required only when the APIs are bulky.
Geo-distribution
Must call out whether geo-distribution of data is required, and why. Here it is not required.
Load Balancing
Explain the need for load balancing for each service.
Purging/Cleanup
State whether cleanup of data is required.
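If cleanup is required, one simple form is a periodic purge of records whose TTL has elapsed. The Python sketch below assumes TTL is stored in seconds alongside a creation timestamp; the `UrlRecord` type, its field names, and the dictionary store are hypothetical stand-ins for the real table.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class UrlRecord:
    long_url: str
    create_time: float  # seconds since the epoch (assumed unit)
    ttl_seconds: float  # lifetime of the short URL (assumed unit)


def is_expired(record: UrlRecord, now: float) -> bool:
    """A record expires once create_time + TTL has been reached."""
    return now >= record.create_time + record.ttl_seconds


def purge_expired(store: Dict[str, UrlRecord], now: float) -> List[str]:
    """Delete expired records in place and return their short IDs."""
    expired = [sid for sid, rec in store.items() if is_expired(rec, now)]
    for sid in expired:
        del store[sid]
    return expired


store = {
    "abc1234": UrlRecord("https://example.com/old", create_time=0.0, ttl_seconds=60.0),
    "xyz9876": UrlRecord("https://example.com/new", create_time=100.0, ttl_seconds=60.0),
}
print(purge_expired(store, now=120.0))  # ['abc1234']
```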
Create the architectural diagram for URL shortener