Systems Design Foundations Flashcards

Question

For mobile, what are the two different notification tools?

Answer 1

Apple: Apple Push Notification service (APNs)
Android: Firebase Cloud Messaging (FCM)

Answer 2

GeoHashing
Redis is an in-memory data store that supports geospatial data types and commands. It uses geohashing to encode latitude and longitude coordinates into a single value (a 52-bit integer), which is stored as the score of a sorted set member. This allows for efficient storage and querying of geospatial data.
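The idea of folding a latitude/longitude pair into one key can be sketched with the textual base-32 geohash. This is a minimal illustration of the general geohash algorithm, not Redis's internal code; the function name `geohash_encode` and the default precision are assumptions made for this example.

```python
# Standard geohash base-32 alphabet (no "a", "i", "l", "o").
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 9) -> str:
    """Encode a coordinate as a geohash string (hypothetical helper)."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate: longitude first, then latitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1  # upper half of the range
            rng[0] = mid
        else:
            bits = bits << 1        # lower half of the range
            rng[1] = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:          # every 5 bits become one character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)
```

Because each extra character only narrows the same cell further, a shorter geohash is always a prefix of a longer one for the same point, which is what makes prefix and range queries on the encoded value useful.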

Answer 3

Power of 1000 (1000^x)   Number        Prefix
0                        Unit
1                        Thousand      Kilo
2                        Million       Mega
3                        Billion       Giga
4                        Trillion      Tera
5                        Quadrillion   Peta

1,000^1 = 1,000 = 1 kilobyte
1,000^2 = 1,000,000 = 1 megabyte
1,000^3 = 1,000,000,000 = 1 gigabyte
1,000^4 = 1,000,000,000,000 = 1 terabyte
1,000^5 = 1,000,000,000,000,000 = 1 petabyte

1 million bytes is 1 megabyte.

Answer 4

Action                                         Time      Comparison
Reading 1 MB sequentially from memory          0.25 ms
Reading 1 MB sequentially from SSD             1 ms      4x memory
Reading 1 MB sequentially from spinning disk   20 ms     20x SSD
Round-trip network latency, CA to Netherlands  150 ms

Answer 5

Item                                                   Size
1 hour video                                           2 GB (HD) --> 1 GB --> 500 MB (low res)
A small book of plain text                             1 MB
A high-resolution photo                                1 MB
A medium-resolution image (or a site layout graphic)   100 KB

Answer 6

Metric                                        Order of Magnitude
Daily active users of major social networks   O(1B)
Hours of video streamed on Netflix per day    O(100M)
Google searches per second                    O(100K)
Instagram feed requests per second            O(50K)
Size of Wikipedia                             O(100 GB)

Answer 7

1. CAP theorem: Does this system prioritize availability or consistency? Note that in some cases the answer differs depending on the part of the system, as you'll see is the case here.
2. Read vs. write ratio: Is this a read-heavy or a write-heavy system? Is either notably heavy for any reason?
3. Fault tolerance: Are there any interesting error scenarios?
4. Usage patterns or query access patterns: Is the access pattern of the system regular, or are there patterns or bursts that require particular attention? For example, the holidays for shopping sites or popular events for ticket booking.

Answer 8

1 day = 24 hours/day × 60 minutes/hour × 60 seconds/minute = 86,400 seconds/day
For estimation, round this to about 100,000 seconds in a day.
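As a rough sketch of how the rounded figure is typically used, the snippet below converts a daily request count into requests per second. The function name `requests_per_second` and the 1 billion requests/day figure are hypothetical, chosen only to make the arithmetic easy to follow.

```python
SECONDS_PER_DAY_EXACT = 24 * 60 * 60   # 86,400
SECONDS_PER_DAY_ROUNDED = 100_000      # convenient figure for estimates


def requests_per_second(requests_per_day: float,
                        seconds_per_day: int = SECONDS_PER_DAY_ROUNDED) -> float:
    """Back-of-the-envelope average request rate (hypothetical helper)."""
    return requests_per_day / seconds_per_day


# Hypothetical load: 1 billion requests per day.
estimate = requests_per_second(1_000_000_000)                        # rounded day
exact = requests_per_second(1_000_000_000, SECONDS_PER_DAY_EXACT)    # exact day
```

The rounded figure understates the true average by roughly 14%, which is usually acceptable for order-of-magnitude sizing.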

Answer 9

Problem: If a file takes a long time to upload, we don't want the user to have to restart the upload on failure.
Solution: Multipart upload
* Break the file into chunks.
* Use fingerprinting to identify each chunk. The fingerprint is just a hash of the chunk bytes: Hash(bytes).
* Send each chunk and keep track of success using the fingerprint.
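The chunk, fingerprint, and track steps can be sketched as follows. This is a minimal in-memory illustration, assuming SHA-256 as the hash and a caller-supplied `send` callable that returns True on success; `upload_chunks`, `split_into_chunks`, and `fingerprint` are hypothetical names, not part of any real upload API.

```python
import hashlib


def split_into_chunks(data: bytes, chunk_size: int) -> list:
    """Break the payload into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def fingerprint(chunk: bytes) -> str:
    """Identify a chunk by the hash of its bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(chunk).hexdigest()


def upload_chunks(data: bytes, chunk_size: int, uploaded: set, send) -> set:
    """Send only chunks whose fingerprint is not already recorded as uploaded.

    `uploaded` holds fingerprints of chunks confirmed so far; `send` is a
    hypothetical callable that returns True when a chunk was accepted.
    """
    for chunk in split_into_chunks(data, chunk_size):
        fp = fingerprint(chunk)
        if fp in uploaded:
            continue              # already confirmed, skip on retry
        if send(chunk):
            uploaded.add(fp)      # record success by fingerprint
    return uploaded
```

Calling `upload_chunks` again with the same `uploaded` set after a failure resends only the chunks that were never confirmed, which is the point of the technique.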

Answer 10

1. Remove the key that is causing the issue.
2. Create a compound key by appending a suffix from 1 to 10 (partitionId:1-10), or use AdId:UserId.
3. Backpressure: force the producer to slow down.
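The compound-key option can be illustrated with a small sketch. It assumes the suffix is picked at random on each write and that readers fan out over every suffix; the names `compound_key` and `all_compound_keys` and the example key `ad42` are hypothetical.

```python
import random


def compound_key(hot_key: str, buckets: int = 10, rng=random) -> str:
    """Spread writes for one hot key across `buckets` sub-keys (suffix 1..buckets)."""
    return f"{hot_key}:{rng.randint(1, buckets)}"


def all_compound_keys(hot_key: str, buckets: int = 10) -> list:
    """Every sub-key a reader must query and aggregate to see the full picture."""
    return [f"{hot_key}:{i}" for i in range(1, buckets + 1)]


# Writers pick one sub-key per event; readers combine all of them.
write_key = compound_key("ad42")
read_keys = all_compound_keys("ad42")
```

The trade-off is that writes are spread over more partitions while reads become more expensive, since they have to aggregate across all suffixes.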

Answer 11

Transfer time in seconds = (file size in MB * 8) / connection speed in Mbps
1000 MB * 8 / 1000 Mbps = 8 seconds
100 MB * 8 / 1000 Mbps = 0.8 seconds
1000 MB * 8 / 100 Mbps = 80 seconds
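The same arithmetic as a small helper, in case it is useful for checking other sizes. The function name `transfer_seconds` is an assumption; it ignores protocol overhead and latency, as the formula above does.

```python
def transfer_seconds(file_size_mb: float, link_mbps: float) -> float:
    """Megabytes to megabits (x8), divided by link speed in megabits per second."""
    return file_size_mb * 8 / link_mbps


one_gb_on_gigabit = transfer_seconds(1000, 1000)
hundred_mb_on_gigabit = transfer_seconds(100, 1000)
one_gb_on_100_mbps = transfer_seconds(1000, 100)
```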

Answer 12

~65k (2^16 = 65,536), because that is the number of ports on the server.

Answer 13

Layer 4
* Transport layer.
* Routes using IP and port.
* No inspection of data; it just forwards the data.
* Use case: the majority of systems should just use this.
Layer 7
* Application layer.
* Routes using the data in the request.
* Use case: you need the data in the request to decide how to route.
* It is more expensive because it requires more computing power and takes more time to process requests.

Answer 14

Separation of Concerns
Chat with WebSockets
* The WebSocket connection is set up by the server, so the partitioning logic can live in one place.
Chat with SSE
* You now need knowledge of partitioning in two places: the backend for message processing and your Layer 7 LB.

Answer 15

Serialization/Deserialization
Serialization has a different meaning in transactions, so we can also use encoding and decoding.

Answer 16

USA: 300 million
World: 8 billion

Answer 17

CRDTs
Operational Transformation

Answer 18

1. Read repair: When a client reads from several replicas, it might see that one replica is missing a write. In that case, the client can send the write to that replica.
2. Anti-entropy: A background process attempts to add missing writes to all instances.

Answer 19

If R + W > N, then we have a quorum.
R is the number of nodes we read from.
W is the number of nodes that must confirm a write.
N is the number of nodes.
- With 2 + 2 > 3, we can support 1 node failure.
- With 3 + 3 > 5, we can support 2 node failures.
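A minimal sketch of the quorum check, using the strict overlap condition R + W > N, which guarantees that every read set shares at least one node with the latest write set. The function names are hypothetical, and `tolerated_failures` assumes a node failure blocks an operation only when fewer than R (or W) nodes remain.

```python
def is_quorum(r: int, w: int, n: int) -> bool:
    """Reads and writes overlap on at least one node when r + w > n."""
    return r + w > n


def tolerated_failures(r: int, w: int, n: int) -> int:
    """Nodes that can be down while both reads and writes still succeed."""
    return n - max(r, w)


three_node = (is_quorum(2, 2, 3), tolerated_failures(2, 2, 3))
five_node = (is_quorum(3, 3, 5), tolerated_failures(3, 3, 5))
```

With N = 4 and R = W = 2, the sum equals N but a read can miss every node that took the write, which is why the strict inequality is used here.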

Answer 20

Two writes are concurrent if neither happens before the other, i.e. if the writes don't know about each other.
3 possibilities:
1. A comes before B
2. B comes before A
3. A and B are concurrent
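One common way to tell these cases apart is to attach a version vector to each write. The card does not name a mechanism, so version vectors are an assumption here, and `happens_before` and `relation` are hypothetical names. Each vector maps a node id to the number of writes from that node that the writer had seen.

```python
def happens_before(a: dict, b: dict) -> bool:
    """True if every counter in `a` is <= the one in `b` and at least one is smaller."""
    nodes = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in nodes)
    some_smaller = any(a.get(k, 0) < b.get(k, 0) for k in nodes)
    return no_greater and some_smaller


def relation(a: dict, b: dict) -> str:
    """Classify two writes by their version vectors."""
    if happens_before(a, b):
        return "A before B"
    if happens_before(b, a):
        return "B before A"
    nodes = set(a) | set(b)
    if all(a.get(k, 0) == b.get(k, 0) for k in nodes):
        return "same version"     # identical vectors: not two distinct writes
    return "concurrent"           # neither write knew about the other
```

For example, `relation({"n1": 2}, {"n1": 1, "n2": 1})` is concurrent: the first write has seen more from `n1`, the second has seen more from `n2`, so neither dominates.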

Answer 21

- Leader/Follower: Easy to reason about, no collisions to worry about; if the writer goes down, failover will not be immediate, and data can be lost during failover.
- Leaderless: Complicated to understand, many ways to get to quorum, always on; collisions are a problem.

Answer 22

Geohash:
- MY ANSWER: I am always going to start with GeoHash because it makes my system simpler and easier to reason about.
- Fixed precision. The world is geohashed already.
- Can handle very high writes (Redis 1 million TPS).
- Predefined, so it is easier to understand; human readable.
Quadtree:
- You need to define what the quadtree represents on a map.
- Adaptive/dynamic precision. We can create a quadtree starting from anywhere.
- Cannot handle high writes because the tree needs to be kept balanced.

Answer 23

For a one-hour video, the bitrate can be estimated by dropping one unit in size and doubling the number. So a 1 GB file would be around 2 megabits per second.
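The rule of thumb can be checked against the direct calculation. This sketch assumes decimal units (1 GB = 1,000 MB) and a one-hour duration by default; `bitrate_mbps` is a hypothetical helper name.

```python
def bitrate_mbps(file_size_gb: float, duration_seconds: float = 3600) -> float:
    """Average bitrate in megabits per second for a file of the given size."""
    megabits = file_size_gb * 1000 * 8   # GB -> MB -> megabits
    return megabits / duration_seconds


one_gb_hour = bitrate_mbps(1.0)   # rule of thumb says about 2 Mbps
two_gb_hour = bitrate_mbps(2.0)   # rule of thumb says about 4 Mbps
```

The exact figures come out slightly above the shortcut (about 2.2 and 4.4 Mbps), close enough for estimation.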

(47 cards)