System Design Flashcards
Delivery Framework
- Requirements
- Core Entities
- API or System Interface
- Data Flow
- High Level Design
- Deep Dive
Requirements
- Functional Requirements - “Users/clients should be able to…” Top 3
- Non-functional Requirements - “System should be / should be able to…” Top 3
- Capacity Estimations
Nonfunctional Requirements Checklist (8)
- CAP theorem: in a distributed system P is a given, so the real trade-off is between C and A
- Environment constraints, ie battery life or limited memory
- Scalability, unique requirements such as bursty traffic or a skewed read/write ratio
- Latency, specifically for anything with meaningful computation
- Durability, how important that data is not lost
- Security, ie data protection, access control
- Fault tolerance, ie redundancy, failover, recovery mechanisms
- Compliance, ie legal or regulatory requirements or standards
Bytes to store data
ASCII - 1 byte
Unicode - 2 bytes (rule of thumb; UTF-8 is 1-4 bytes per character)
Split seconds
Millisecond (ms) 1/1000
Microsecond (us) 1/1,000,000
Nanosecond (ns) 1/1,000,000,000
Read latency
Memory: 1 MB in 0.25 ms (~4 GB/s)
SSD (~4x slower than memory): 1 MB in 1 ms (~1 GB/s)
Disk (~20x slower than SSD): 1 MB in 20 ms (~50 MB/s)
Worldwide network round trip: ~150 ms (~6 per second)
Request Calculations by second
~2.5 million seconds per month (30 days × 86,400 s ≈ 2.6M)
1 million per month = .4/s
2.5 million per month = 1/s
10 million per month = 4/s
100 million per month = 40/s
1 billion per month = 400/s
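A quick sanity check of those conversions as a minimal C# sketch (the 2.5M figure is the rounded rule of thumb, not exact calendar math):

```csharp
// Back-of-the-envelope helper: ~2.5 million seconds per month, so divide monthly volume by 2.5M.
using System;

const double SecondsPerMonth = 2_500_000; // 30 days * 86,400 s ≈ 2.59M, rounded for mental math

double PerSecond(double requestsPerMonth) => requestsPerMonth / SecondsPerMonth;

Console.WriteLine(PerSecond(1_000_000));     // ≈ 0.4 req/s
Console.WriteLine(PerSecond(100_000_000));   // ≈ 40 req/s
Console.WriteLine(PerSecond(1_000_000_000)); // ≈ 400 req/s
```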
Storage estimates:
2 hr movie - 1 GB
Small plain text book - 1 MB
High res photo - 1 MB
Med res image - 100 KB
DB Writes vs Reads
A write is roughly 40x more expensive than a read (rule of thumb)
Core Entities
Spend ~2 minutes here.
What the API will exchange and what the system will persist in the data model. Ex: User, Tweet, Follow for Twitter.
Bullet list
API or System Interface
RESTful or GraphQL
Endpoints with path and parameters
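A minimal ASP.NET Core sketch of what an endpoint with path and body parameters can look like, using the Twitter example from Core Entities (the Tweet routes and types are hypothetical):

```csharp
using Microsoft.AspNetCore.Mvc;

// Hypothetical Tweet resource from the Core Entities example: GET /api/tweets/{id}, POST /api/tweets.
[ApiController]
[Route("api/tweets")]
public class TweetsController : ControllerBase
{
    [HttpGet("{id}")]
    public ActionResult<Tweet> GetById(long id) =>
        // Path parameter {id}; look up the tweet and return 404 if missing (lookup omitted here).
        NotFound();

    [HttpPost]
    public ActionResult<Tweet> Create([FromBody] CreateTweetRequest request) =>
        // Body parameter; persistence omitted, so the created id is a placeholder.
        CreatedAtAction(nameof(GetById), new { id = 1L }, new Tweet(1L, request.Text));
}

public record Tweet(long Id, string Text);
public record CreateTweetRequest(string Text);
```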
Data Flow
Actions or processes that the system performs on the input to produce the desired outputs
Core Concepts
Scaling - work distribution and data distribution
Consistency
Locking
Indexing
Communication Protocols
Security - authentication and authorization, encryption, data protection
Monitoring - infrastructure, system level, application level
Key Technologies
Core DB
Blob storage
Search optimized DB
API gateway
Load balancer
Queue
Streams / event sourcing
Distributed lock
Distributed cache
CDN
Patterns
DB backed CRUD with caching
Async job worker pool
2 stage architecture
Event driven architecture
Durable job processing
Proximity based services
Core API - high level overview
“Our Core API uses a layered .NET architecture, deployed in EKS. Controllers
handle HTTP routing, Services handle business logic, and a Data layer interacts
with Aurora and Redis. This lets us scale the service horizontally while keeping
the codebase maintainable.”
Core API - layered architecture justification
“We wanted to separate concerns—controllers focus on HTTP requests, services
encapsulate domain rules, and our data layer deals with Aurora and caching. This
approach cuts down on coupling and makes it easier to adapt or extract
microservices down the road.”
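A minimal sketch of that controller/service/data-layer split, with hypothetical names standing in for the real Core API classes:

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Controllers handle HTTP, services hold business rules, the data layer hides Aurora/Redis.
// All names here are hypothetical stand-ins for the real Core API classes.
[ApiController]
[Route("api/events")]
public class EventsController : ControllerBase
{
    private readonly IEventService _service;
    public EventsController(IEventService service) => _service = service;

    [HttpGet("{id}")]
    public async Task<ActionResult<SportEvent>> Get(long id)
    {
        var evt = await _service.GetEventAsync(id);
        if (evt is null) return NotFound();
        return Ok(evt);
    }
}

public interface IEventService                    // business logic boundary
{
    Task<SportEvent?> GetEventAsync(long id);
}

public interface IEventRepository                 // data layer boundary (Aurora + Redis behind it)
{
    Task<SportEvent?> FindAsync(long id);
}

public class EventService : IEventService
{
    private readonly IEventRepository _repo;
    public EventService(IEventRepository repo) => _repo = repo;
    public Task<SportEvent?> GetEventAsync(long id) => _repo.FindAsync(id);
}

public record SportEvent(long Id, string Name);
```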
Core processor - explanation
“We have a central ETL pipeline—the Core Processor—which ingests data from
multiple providers, stores raw payloads in S3, and then transforms/loads it into Aurora.
Tasks run on a cron-based scheduler and retry on failure with exponential backoff, keeping the pipeline resilient even if a provider is temporarily down.”
Core API - why K8s?
“Kubernetes gave us automated scaling and rolling updates out of the box. We
can spin up more pods during major sporting events and scale back when traffic is
low, all while ensuring near-zero downtime.”
Core API - EKS rolling updates
“We use a rolling update strategy so that when deploying a new version of the
API, only one old pod goes down at a time—our system stays online, and if
something fails, we can roll back quickly.”
Core API - stateless pods
“Even though our application manages a lot of data, we designed each pod to be stateless. Any persistent data—sessions, user info, or stats—resides in Aurora, Redis, or S3.
That means losing a pod doesn’t risk losing data.”
Core API - Ingress and Helm templating
“We have an internal ALB that terminates TLS and checks liveness via /health.
The ALB is configured via Ingress annotations in our Helm chart, ensuring only
healthy pods receive requests. We define everything in Helm charts, from replicas
and resource limits to Ingress rules. Environment-specific overrides like values-
stage.yaml and values-prod.yaml let us run the same code in staging vs.
production with minimal overhead.”
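A minimal sketch of wiring up the /health probe the ALB checks, assuming ASP.NET Core health checks and the newer minimal-hosting style (the real service may use Startup.cs and register Aurora/Redis checks as well):

```csharp
// /health wiring, minimal-hosting style; the real service may use Startup.cs instead,
// and would register concrete checks (Aurora, Redis) alongside the default liveness check.
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddControllers();
builder.Services.AddHealthChecks();   // extra DB/cache checks can be registered here

var app = builder.Build();
app.MapHealthChecks("/health");       // the path the ALB target group probes
app.MapControllers();
app.Run();
```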
Core API - CI/CD pipeline
“We use CircleCI to build Docker images, run tests, push the image to ECR, then
automatically update our Helm chart. If linting or validation fails, the deployment
never proceeds—meaning we catch issues before they hit production.”
Core API - automatic rollbacks
“Our pipeline can roll back a Helm release if we detect a spike in 500 errors or
failing health checks. That safety net lets us move fast and confidently ship
updates.”
Core API - environment specific builds
“For each commit on the ‘master’ branch, CircleCI sets DOTNETCORE_ENVIRONMENT=production and deploys to our production
cluster. For ‘stable’, it uses stage—we keep these pipelines consistent, ensuring
minimal drift.”
Core API - Redis caching
“We cache frequently requested data in Redis—like top odds or event stats—for short TTLs.
This offloads read traffic from Aurora and drastically reduces latency on hot endpoints.”
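A minimal cache-aside sketch of that pattern, assuming StackExchange.Redis; the key format, 30-second TTL, and the Aurora loader are hypothetical placeholders:

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;
using StackExchange.Redis;

// Cache-aside with a short TTL: check Redis, fall back to Aurora on a miss, then repopulate.
// The key format, 30s TTL, and LoadTopOddsFromAurora are hypothetical placeholders.
public class OddsCache
{
    private static readonly TimeSpan Ttl = TimeSpan.FromSeconds(30);
    private readonly IDatabase _redis;

    public OddsCache(IConnectionMultiplexer mux) => _redis = mux.GetDatabase();

    public async Task<TopOdds?> GetTopOddsAsync(long eventId)
    {
        var key = $"odds:top:{eventId}";

        var cached = await _redis.StringGetAsync(key);       // 1. try the cache
        if (cached.HasValue)
            return JsonSerializer.Deserialize<TopOdds>(cached.ToString());

        var fresh = await LoadTopOddsFromAurora(eventId);     // 2. miss: hit the database
        if (fresh is not null)                                // 3. repopulate with a short TTL
            await _redis.StringSetAsync(key, JsonSerializer.Serialize(fresh), Ttl);
        return fresh;
    }

    private Task<TopOdds?> LoadTopOddsFromAurora(long eventId) =>
        Task.FromResult<TopOdds?>(null);                      // placeholder for the real query
}

public record TopOdds(long EventId, decimal Price);
```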
Core API - in memory caching
“Each pod has an in-memory cache for micro-optimizations, but it’s not critical if a pod restarts—it’s purely ephemeral. That’s a classic stateless approach, as all permanent state lives in external data stores.”
Core API - Metrics
“We use Prometheus and Grafana for real-time visibility into standard and custom metrics. That data helps us spot anomalies or performance regressions fast, and Grafana alerts trigger Slack notifications to the proper team.”
Core API - Rollbar
“Any exception in the Core API automatically logs to Rollbar, and critical errors trigger Slack notifications to the proper teams. During a major sporting event, if we see a surge of 500 errors, we can quickly pinpoint which endpoint or DB call is failing.”
Core API - latency tracking
“We keep a histogram of HTTP request durations. By tracking P95 and P99 latencies, we ensure that even our worst-case requests stay within acceptable bounds, especially during heavy game traffic.”
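A minimal sketch of recording that latency histogram with prometheus-net (metric name and buckets are hypothetical; prometheus-net's ASP.NET Core middleware can also record HTTP durations automatically). P95/P99 are then derived from the buckets in Grafana:

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Prometheus;

// Records each request's duration into a histogram; Grafana computes P95/P99 from the buckets.
// Metric name and bucket layout are hypothetical.
public static class RequestMetrics
{
    private static readonly Histogram RequestDuration = Metrics.CreateHistogram(
        "core_api_http_request_duration_seconds",
        "HTTP request duration in seconds",
        new HistogramConfiguration
        {
            Buckets = Histogram.ExponentialBuckets(0.005, 2, 10)  // 5ms .. ~2.5s
        });

    public static async Task Measure(Func<Task> handler)
    {
        var sw = Stopwatch.StartNew();
        try { await handler(); }
        finally { RequestDuration.Observe(sw.Elapsed.TotalSeconds); }
    }
}
```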
Core API - CAP theorem
“We operate in a distributed AWS environment, so partition tolerance is mandatory.
We typically favor high availability over strict consistency by reading from Aurora replicas—though the primary itself is strongly consistent.
That means brief eventual consistency for read workloads, which is acceptable for this domain.”
Core API - consistency
“We do strongly consistent writes to Aurora’s primary. But for reads—especially from replicas or caches—we accept short-lived eventual consistency.
The lag is usually small, and it’s worth it to maintain high throughput under load.”
Core API - security
“All local developers must assume an MFA-secured AWS role. Secrets are stored in Parameter Store or K8s secrets, meaning we never expose plain-text creds in code or logs.”
Core API - internal ALB
“We use an internal ALB for traffic, so it’s not publicly accessible. On top of that, Kubernetes role based access control restricts who can modify deployments or read secrets, ensuring a tight security posture.”
Core API - estimating capacity
“We measure requests-per-second during major sporting events and compare it to CPU/memory usage. If we see pods hitting 80% CPU or if DB queries approach saturation, we scale out.
Aurora read replicas handle the read spikes, and Redis further reduces direct DB hits.”
Core API - main bottleneck
“Ultimately, Aurora can become the bottleneck for heavy writes. We mitigate that with indexing, short caches, and read replicas. If needed, we could further partition data, but so far Aurora’s performance has met our needs.
To stay ahead of that, I recently built a nightly archiving task that moves all market lines older than 18 months out of the hot path, which covered a few hundred million records from a terabyte-scale table.”
PSO - centralized data for all properties
Problem: Multiple newly acquired properties each ingested sports data differently, creating inconsistencies.
Solution: We built a Core API on .NET, containerized on EKS, and standardized data ingestion via the Core Processor.
Outcome: We reduced duplication, established a single source of truth, and scaled seamlessly for peak sports seasons.
PSO - zero downtime deployments
Problem: Rolling updates were risky with older infrastructure, often causing partial outages.
Solution: By using Helm with rolling updates and readiness probes, we can gradually shift
traffic to new pods while old pods are drained.
Outcome: Near-zero downtime deploys and the ability to roll back quickly if metrics or logs show a spike in errors.
PSO - real-time observability
Problem: We lacked insight into production performance; debugging took hours.
Solution: We integrated Telegraf for metrics and Rollbar for error logs.
Outcome: The moment error rates spike, we get Slack alerts and can see exactly which
endpoints or queries are failing, cutting response times in half.
Single Responsibility Principle
A class should have a single responsibility and only one reason to change. Everything it does should be closely related so the class doesn't bloat.
Open-Closed Principle
Code should be open to extension but closed to modification. Instead of modifying existing code, extend it, e.g. via a subclass that inherits from the base class or via extension methods.
Liskov Substitution Principle
A child class should be substitutable anywhere its parent is expected, without breaking behavior.
Interface Segregation Principle
Clients should never be forced to implement an interface they don't use, or to depend on methods they don't use.
Dependency Inversion
High-level modules shouldn't depend on low-level modules; both should depend on abstractions.
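A minimal C# sketch of Dependency Inversion with illustrative names: the high-level service depends only on an abstraction, and the low-level implementation depends on that same abstraction:

```csharp
// Dependency Inversion sketch with illustrative names: the high-level OrderService depends on
// the IOrderRepository abstraction, and the low-level SQL implementation depends on it too.
public interface IOrderRepository
{
    void Save(Order order);
}

public class OrderService                             // high-level module
{
    private readonly IOrderRepository _repo;
    public OrderService(IOrderRepository repo) => _repo = repo;

    public void PlaceOrder(Order order) => _repo.Save(order);
}

public class SqlOrderRepository : IOrderRepository    // low-level module
{
    public void Save(Order order) { /* write to the database */ }
}

public record Order(long Id, decimal Total);
```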
Pattern: CRUD service
The most common and simplest pattern. Backed by a DB and cache, fronted by an API and load balancer.
Client -> API -> Load Balancer -> Service -> Cache -> Database
Pattern: async job worker pool
For systems that have lots of processing and can tolerate some delay. Good for processing images and videos.
Queue -> Workers -> Database
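A minimal in-process sketch of the queue + worker pool idea using System.Threading.Channels (in production the queue would be an external system like SQS or Kafka; the names are illustrative):

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// In-process queue + worker pool sketch; in production the queue would be external (SQS, Kafka, ...).
// ThumbnailJobPool and ProcessImageAsync are illustrative names.
public class ThumbnailJobPool
{
    private readonly Channel<string> _queue = Channel.CreateBounded<string>(1000);

    public ValueTask Enqueue(string imageKey) => _queue.Writer.WriteAsync(imageKey);

    public Task Start(int workerCount, CancellationToken ct) =>
        Task.WhenAll(Enumerable.Range(0, workerCount).Select(_ => RunWorker(ct)));

    private async Task RunWorker(CancellationToken ct)
    {
        // Each worker pulls jobs until cancellation; slow work here doesn't block the producer.
        await foreach (var imageKey in _queue.Reader.ReadAllAsync(ct))
            await ProcessImageAsync(imageKey);        // resize, then persist the result
    }

    private Task ProcessImageAsync(string imageKey) => Task.CompletedTask;  // placeholder
}
```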
Pattern: 2 stage architecture
Good for recommendations, search, and route planning: a fast but inaccurate stage narrows the candidates, then a slow but precise stage finishes the job.
Ranking service (slow but precise) -> Vector DB (fast but inaccurate) <- Blob storage
Pattern: Event driven architecture
Centered around events; good for systems that need to react to changes in real time, e.g. e-commerce reacting when an order is placed
Event producers -> event routers/brokers (Kafka or eventbridge) -> event consumers to process the events and take necessary actions
Pattern: Durable job processing
For long running jobs. Store jobs in something like Kafka, then a pool of workers processes them. Workers periodically checkpoint progress to a durable log, and if a worker crashes another can pick up where it left off.
Phase 1 workers -> distributed, durable log -> Phase 2 workers -> durable log -> Phase 3 workers
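A minimal sketch of the checkpointing idea on Kafka, assuming Confluent.Kafka: auto-commit is off, so an offset is committed only after the job's work succeeds, and a replacement worker resumes from the last committed offset (topic, group id, and ProcessJob are hypothetical):

```csharp
using System.Threading;
using Confluent.Kafka;

// Checkpointing via Kafka offsets, assuming Confluent.Kafka: auto-commit is off, so an offset is
// only committed after the job's work succeeds; a replacement worker resumes from the last commit.
// Topic, group id, and ProcessJob are hypothetical.
public static class DurableWorker
{
    public static void Run(CancellationToken ct)
    {
        var config = new ConsumerConfig
        {
            BootstrapServers = "kafka:9092",
            GroupId = "job-workers",
            EnableAutoCommit = false              // commit manually, only after success
        };

        using var consumer = new ConsumerBuilder<Ignore, string>(config).Build();
        consumer.Subscribe("jobs");

        while (!ct.IsCancellationRequested)
        {
            var result = consumer.Consume(ct);    // next job from the durable log
            ProcessJob(result.Message.Value);     // do this phase's work
            consumer.Commit(result);              // checkpoint progress
        }
    }

    private static void ProcessJob(string payload) { /* phase work goes here */ }
}
```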
Pattern: Proximity based services
Ex Uber
Divide the geographical area into regions and index entities within each region. This lets the system exclude vast areas that don't contain relevant entities.
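A minimal sketch of grid-based proximity indexing with an illustrative cell size (real systems typically use geohash, S2, or H3 cells): only the query point's cell and its eight neighbors are scanned.

```csharp
using System;
using System.Collections.Generic;

// Grid-based proximity indexing sketch: entities are bucketed by cell, and a nearby-search only
// scans the query point's cell plus its eight neighbors. Cell size is illustrative; production
// systems often use geohash, S2, or H3 cells instead.
public class GridIndex
{
    private const double CellSizeDegrees = 0.01;   // roughly ~1 km near the equator
    private readonly Dictionary<(int, int), List<long>> _cells = new();

    private static (int, int) CellOf(double lat, double lng) =>
        ((int)Math.Floor(lat / CellSizeDegrees), (int)Math.Floor(lng / CellSizeDegrees));

    public void Add(long entityId, double lat, double lng)
    {
        var cell = CellOf(lat, lng);
        if (!_cells.TryGetValue(cell, out var list))
            _cells[cell] = list = new List<long>();
        list.Add(entityId);
    }

    public IEnumerable<long> Nearby(double lat, double lng)
    {
        var (row, col) = CellOf(lat, lng);
        for (var dr = -1; dr <= 1; dr++)            // vast empty regions are never touched
            for (var dc = -1; dc <= 1; dc++)
                if (_cells.TryGetValue((row + dr, col + dc), out var list))
                    foreach (var id in list)
                        yield return id;
    }
}
```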