System Design Week 1 - Twitter Flashcards

Question 1

Q

What are the Functional Requirements?

Answer

A

User account management is supported.
Users can post a tweet containing:
    up to 140 characters of text
    media
    hashtags
Users can follow and unfollow other users.
Users can view a feed of tweets.
Users can like and comment on tweets.
Trending hashtags are available for every geolocation.
Analytics runs in the background.
Search is supported.

Question 2

Q

What are the Non-Functional Requirements?

Answer

A

Number of users: 400 million
Tweets per second: 6,000
Feed views per second: 0.3 million (300,000)
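
The figures above imply a read-heavy workload. The following is a quick back-of-the-envelope sketch of what they mean per day. The 280-byte tweet size is an assumption made only for illustration (140 characters at roughly 2 bytes each, ignoring metadata and media); it is not a figure from the card.

```python
# Figures taken from the card.
USERS = 400_000_000
TWEETS_PER_SEC = 6_000
FEED_VIEWS_PER_SEC = 300_000

# Assumption for illustration only: 140 characters at ~2 bytes each, no metadata.
ASSUMED_BYTES_PER_TWEET = 280

SECONDS_PER_DAY = 24 * 60 * 60


def read_write_ratio() -> float:
    """Feed views served for every tweet written."""
    return FEED_VIEWS_PER_SEC / TWEETS_PER_SEC


def tweets_per_day() -> int:
    """Tweets written per day at the stated rate."""
    return TWEETS_PER_SEC * SECONDS_PER_DAY


def text_bytes_per_day() -> int:
    """Raw tweet text written per day under the assumed tweet size."""
    return tweets_per_day() * ASSUMED_BYTES_PER_TWEET


if __name__ == "__main__":
    print(f"read/write ratio: {read_write_ratio():.0f}:1")
    print(f"tweets per day: {tweets_per_day():,}")
    print(f"text storage per day: {text_bytes_per_day() / 1e9:.1f} GB")
```

With these inputs, reads outnumber writes by 50 to 1, which suggests that the read path (feed views) is the part most worth optimizing.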

Question 3

Q

What are the Microservices?

Answer

A

Tweet ingestion service
Social graph
Feed generator
Feed dashboard
Endorsement service
Search
Trending hashtags
Analytics
Account management service

Question 4

Q

Create a Logical Diagram for Twitter.

Answer

A

https://drive.google.com/file/d/1zwfcMZPQKGm2Dzso8gjr-2-GMIf5Q45x/view?usp=sharing

Question 5

Q

What is the Schema?

Answer

A

https://drive.google.com/file/d/1QijaEz0hNuTDn-4ursycbQwciwJSSVk0/view?usp=sharing

Question 6

Q

What are the APIs?

Answer

A

insertTweetText(userId, content)
insertTweetMedia(userId, bytestream, offset, length)
likeTweet(tweetId)
dislikeTweet(tweetId)
replyTweet(tweetId, content)
search(text)
follow(userId)
unfollow(userId)
retweet(tweetId)
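
As a rough illustration of how a few of these calls might fit together, here is an in-memory sketch. The class names `InMemoryStore` and `TwitterApi`, the `caller_id` argument (standing in for the authenticated user, which signatures such as likeTweet(tweetId) leave implicit), the return values, and the substring-based search are all assumptions for illustration. A real service would sit behind authentication and persistent storage.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Tweet:
    tweet_id: int
    user_id: str
    content: str
    liked_by: set = field(default_factory=set)


class InMemoryStore:
    """Hypothetical shared storage; a real system would use databases."""

    def __init__(self):
        self.tweets = {}      # tweet_id -> Tweet
        self.following = {}   # user_id -> set of followed user ids
        self._ids = count(1)

    def next_id(self):
        return next(self._ids)


class TwitterApi:
    """Hypothetical per-caller view of a subset of the API on the card."""

    def __init__(self, store, caller_id):
        self.store = store
        self.caller_id = caller_id  # assumed to come from authentication

    def insertTweetText(self, user_id, content):
        if len(content) > 140:
            raise ValueError("tweet text is limited to 140 characters")
        tweet_id = self.store.next_id()
        self.store.tweets[tweet_id] = Tweet(tweet_id, user_id, content)
        return tweet_id

    def likeTweet(self, tweet_id):
        self.store.tweets[tweet_id].liked_by.add(self.caller_id)

    def follow(self, user_id):
        self.store.following.setdefault(self.caller_id, set()).add(user_id)

    def unfollow(self, user_id):
        self.store.following.get(self.caller_id, set()).discard(user_id)

    def search(self, text):
        # Naive case-insensitive substring match, for illustration only.
        needle = text.lower()
        return [t.tweet_id for t in self.store.tweets.values()
                if needle in t.content.lower()]


if __name__ == "__main__":
    store = InMemoryStore()
    alice = TwitterApi(store, "alice")
    bob = TwitterApi(store, "bob")
    tid = alice.insertTweetText("alice", "hello #world")
    bob.follow("alice")
    bob.likeTweet(tid)
    print(store.tweets[tid].liked_by, store.following)
```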

Question 7

Q

What is the Business Logic?

Answer

A

When end users post Tweets on Twitter, the load balancers forward the requests to the servers handling the Tweet service. The server identifies any attachments (image, video) in the Tweet and stores them in the Blobstore. The Tweet text, user information, and all metadata are stored in separate databases. Data is stored in Bigtable (Google Cloud Bigtable), which is fully managed, scales easily, and keeps rows sorted by key. Now assume the user sends a home timeline request using the /viewHome_timeline API. In a similar way, the Top-k trends are obtained and attached to the response to the timeline request.
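
The write path described above can be sketched roughly as follows. Plain dictionaries stand in for the Blobstore and for Bigtable. The row-key layout (user id, then a zero-padded timestamp), the function names, and the hashtag regular expression are assumptions for illustration and are not part of the original design. The last function shows one simple way to compute Top-k trends from the stored text.

```python
import re
import uuid
from collections import Counter

blob_store = {}   # blob_id -> raw bytes (stand-in for the Blobstore)
tweet_table = {}  # row_key -> row dict (stand-in for a sorted-key table)

HASHTAG = re.compile(r"#(\w+)")


def post_tweet(user_id, text, attachments, timestamp_ms):
    """Store attachments as blobs, then store text and metadata under a row key."""
    blob_ids = []
    for data in attachments:
        blob_id = uuid.uuid4().hex
        blob_store[blob_id] = data
        blob_ids.append(blob_id)
    # Assumed key layout: zero-padding keeps string order equal to time order.
    row_key = f"{user_id}#{timestamp_ms:013d}"
    tweet_table[row_key] = {"text": text, "blob_ids": blob_ids}
    return row_key


def tweets_of(user_id):
    """Emulate a range scan over one user's rows, in key (time) order."""
    prefix = f"{user_id}#"
    return [tweet_table[k] for k in sorted(tweet_table) if k.startswith(prefix)]


def top_k_trends(k):
    """Return the k most frequent hashtags across all stored tweets."""
    counts = Counter(
        tag.lower()
        for row in tweet_table.values()
        for tag in HASHTAG.findall(row["text"])
    )
    return [tag for tag, _ in counts.most_common(k)]


if __name__ == "__main__":
    post_tweet("u1", "hi #python", [b"image-bytes"], 2000)
    post_tweet("u1", "first #Python #go", [], 1000)
    print([row["text"] for row in tweets_of("u1")])
    print(top_k_trends(1))
```

Keeping media in the blob store and only the blob ids in the row keeps the table rows small, which matches the split between attachments and text described in the answer.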

Question 8

Q

What are the Microservice Design Considerations?

Answer

A

CAP theorem
    AP system. Must explain why AP and not CP.
Scaling
    Must discuss which of the following reasons apply:
        Scale for storage
        Scale for throughput
        Scale for API parallelization
        Need to remove hotspots
        Availability and geo-distribution
Sharding
    Explain why sharding is (or is not) required here.
    State whether vertical or horizontal sharding is required.
    What will the partition key be?
    State whether a fixed number of shards or dynamic shard servers are required.
    Consistent hashing must be mentioned if the number of shards is dynamic.
    Here:
        for text → horizontal sharding
        for media → horizontal + vertical sharding
Replication
    Required. Must explain the reason, e.g. for availability as well as throughput.
Caching
    Must explain clearly whether caching is required.
    If caching is required, state which caching mechanism to use.
    State the cache eviction policy.
API parallelization
    Must explain clearly that API parallelization is required only when APIs are bulky.
    Here:
        for text → no
        for media → maybe yes
Geo-distribution
    Geo-distribution of data is not required here. Must call out whether it is required, and why.
Load balancing
    Explain the need for load balancing for each service.
Purging/cleanup
    State whether cleanup of data is required.
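
The Sharding item above asks for consistent hashing when the number of shards is dynamic. A minimal sketch of a hash ring with virtual nodes follows. The class name `HashRing`, the shard names, the choice of MD5 as the hash function, and the default of 100 virtual nodes per shard are assumptions for illustration, not details from the card.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, shards=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []   # sorted hash positions on the ring
        self._owner = {}    # hash position -> shard name
        for shard in shards:
            self.add_shard(shard)

    def add_shard(self, shard):
        for i in range(self.vnodes):
            point = _hash(f"{shard}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = shard

    def remove_shard(self, shard):
        for i in range(self.vnodes):
            point = _hash(f"{shard}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def shard_for(self, key):
        """Return the shard owning the first ring position after the key's hash."""
        if not self._points:
            raise LookupError("ring has no shards")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


if __name__ == "__main__":
    ring = HashRing(["shard-a", "shard-b", "shard-c"])
    keys = [f"user{i}" for i in range(1000)]
    before = {k: ring.shard_for(k) for k in keys}
    ring.add_shard("shard-d")
    moved = sum(1 for k in keys if ring.shard_for(k) != before[k])
    print(f"keys moved after adding a shard: {moved} of {len(keys)}")
```

Adding a shard moves only the keys that now fall to the new shard, and every other key stays where it was. That property is what makes a dynamic shard count manageable compared with hashing modulo the number of shards.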

(8 cards)