Additional System Design Flashcards

1
Q

How can caches go wrong?

A
  1. Thundering herd problem - a large number of keys in the cache expire at the same time. The query requests then hit the database directly and overload it.
    Mitigations: 1. Avoid setting the same expiry time for the keys by adding a random jitter to each TTL (see the sketch below). 2. Allow only the core business data to hit the database and block non-core data from accessing the database until the cache is back up.
  2. Cache penetration - the requested key exists in neither the cache nor the database, so the app cannot retrieve data from the database to populate the cache. Repeated misses create a lot of pressure on both the cache and the database.
    Solutions: 1. Cache a null value for non-existing keys to avoid hitting the database. 2. Use a Bloom filter to check key existence first, and if the key doesn't exist, avoid hitting the database.
  3. Cache breakdown - similar to the thundering herd problem, but triggered when a single hot key expires and a large number of requests hits the database. Since hot keys can take up 80% of the queries, a common mitigation is to set no expiry time for them.
  4. Cache crash - the cache is down and all the requests go to the database.
    Solutions: 1. Set up a circuit breaker so that when the cache is down, the app services don't hit the cache or the database. 2. Run the cache as a cluster to improve cache availability.
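
A minimal cache-aside sketch of the first two mitigations: TTL jitter (thundering herd) and null-value caching (cache penetration). `redis_client` and `db_lookup` are hypothetical stand-ins, assuming a redis-py-style `get`/`set` interface.

```python
import random

BASE_TTL = 3600   # one hour
JITTER = 300      # up to five minutes of random spread

def get_with_protection(redis_client, db_lookup, key):
    cached = redis_client.get(key)
    if cached is not None:
        return None if cached == b"__NULL__" else cached

    value = db_lookup(key)
    if value is None:
        # Cache penetration fix: store a short-lived sentinel for
        # non-existing keys so repeated misses skip the database.
        redis_client.set(key, "__NULL__", ex=60)
        return None

    # Thundering herd fix: randomize each TTL so keys written together
    # do not all expire together.
    redis_client.set(key, value, ex=BASE_TTL + random.randint(0, JITTER))
    return value
```
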
2
Q

4 most popular use cases for UDP (User Datagram Protocol)

A

UDP is used in various software architectures for its simplicity, speed and low overhead compared to other protocols like TCP.

  1. Live video streaming - many VoIP & video conferencing apps leverage UDP due to its lower overhead & ability to tolerate packet loss. Real-time communication benefits from UDP’s reduced latency compared to TCP.
  2. DNS (Domain Name System) - DNS queries typically use UDP for their fast and lightweight nature. Although DNS can also use TCP for large responses or zone transfers, most queries are handled via UDP.
  3. Market data multicast - in low latency trading, UDP is utilised for efficient market data delivery to multiple recipients simultaneously.
  4. IoT - UDP is often used in IoT devices for communications, sending small packets of data between devices.
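
As a quick illustration of why UDP is so lightweight, here is a minimal sketch using Python's standard socket module; the address and payload are illustrative only.

```python
import socket

# Fire-and-forget datagram: no connection setup, no handshake.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(b"sensor-reading:42", ("203.0.113.10", 9999))
# If the packet is lost, it is simply gone; there is no retransmission.
# That trade-off is what buys UDP its low latency and low overhead.
```
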
3
Q

How does a typical push notification system work?

A

The architecture of a typical notification system covers the major notification channels:
- In app notifications
- Email notifications
- SMS & OTP notifications
- Social media pushes

Steps:
1. The business services send notifications to the notification gateway. The gateway supports two modes: one receives a single notification at a time, and the other receives notifications in batches.
2. The notification gateway forwards the notifications to the distribution service, where the messages are validated, formatted, and scheduled based on settings. The notification template repository allows users to pre-define the message format, and the channel preference repository allows users to pre-define the preferred delivery channels (sketched in code after these steps).
3. The notifications are then sent to the routers, normally message queues.
4. The channel services communicate with various internal and external delivery channels, including in-app notifications, email delivery, SMS delivery, and social media apps.
5. The delivery metrics are captured by the notification tracking and analytics service, where the operations team can view the analytical reports and improve user experiences.
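
A hedged sketch of steps 2-3: format a message from a template, look up the user's preferred channels, and hand the result to per-channel queues. TEMPLATES, PREFERENCES, and the queue layout are illustrative assumptions, not a real API.

```python
TEMPLATES = {"otp": "Your code is {code}"}          # template repository
PREFERENCES = {"user42": ["sms", "email"]}          # channel preferences

def distribute(user_id, template_id, params, queues):
    message = TEMPLATES[template_id].format(**params)   # format from template
    for channel in PREFERENCES.get(user_id, ["in_app"]):
        queues[channel].append((user_id, message))      # router = message queue

queues = {"sms": [], "email": [], "in_app": []}
distribute("user42", "otp", {"code": "123456"}, queues)
```
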
4
Q

Have you heard of the 12-Factor App?

A

The “12 Factor App” offers a set of best practices for building modern software applications.
Following these 12 principles can help developers and teams in building reliable, scalable, and manageable applications.

Here’s a brief overview of each principle:
1. Codebase:
Have one place to keep all your code, and manage it using version control like Git.
2. Dependencies:
List all the things your app needs to work properly, and make sure they're easy to install.
3. Config:
Keep important settings like database credentials separate from your code, so you can change them without rewriting code (a small sketch follows this list).
4. Backing Services:
Use other services (like databases or payment processors) as separate components that your app connects to.
5. Build, Release, Run:
Make a clear distinction between preparing your app, releasing it, and running it in production.
6. Processes:
Design your app so that each part doesn't rely on a specific computer or memory. It's like making LEGO blocks that fit together.
7. Port Binding:
Let your app be accessible through a network port, and make sure it doesn't store critical information on a single computer.
8. Concurrency:
Make your app able to handle more work by adding more copies of the same thing, like hiring more workers for a busy restaurant.
9. Disposability:
Your app should start quickly and shut down gracefully, like turning off a light switch instead of yanking out the power cord.
10. Dev/Prod Parity:
Ensure that what you use for developing your app is very similar to what you use in production, to avoid surprises.
11. Logs:
Keep a record of what happens in your app so you can understand and fix issues, like a diary for your software.
12. Admin Processes:
Run special tasks separately from your app, like doing maintenance work in a workshop instead of on the factory floor.
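
To make the Config factor concrete, here is a minimal sketch of reading settings from the environment; the variable name DATABASE_URL is a common convention, not something the methodology mandates.

```python
import os

# Factor 3 (Config): credentials live in the environment, not the code.
# The same code then runs unchanged in dev, staging, and production.
DATABASE_URL = os.environ.get("DATABASE_URL", "postgres://localhost/dev")

def connect():
    # Only the environment differs between deployments.
    print(f"connecting to {DATABASE_URL}")
```
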
5
Q

Visualizing a SQL query

A

SQL statements are executed by the database system in several steps, including:
- Parsing the SQL statement and checking its validity
- Transforming the SQL into an internal representation, such as relational algebra
- Optimizing the internal representation and creating an execution plan that utilizes index information
- Executing the plan and returning the results

A query is written in this clause order:

SELECT
FROM
JOIN ... ON
WHERE
GROUP BY
HAVING
ORDER BY
LIMIT

The database, however, evaluates the clauses in a different logical order: FROM and JOIN/ON first, then WHERE, GROUP BY, HAVING, SELECT, and finally ORDER BY and LIMIT.
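
To see the optimizer at work, here is a small sketch using Python's built-in sqlite3 module; SQLite's EXPLAIN QUERY PLAN shows the plan the final execution step will run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, age INT)")
conn.execute("CREATE INDEX idx_age ON users(age)")

# Ask the optimizer how it would execute this query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users WHERE age > 30"
).fetchall()
print(plan)  # expected to mention a SEARCH using index idx_age
```
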

6
Q

How does Redis architecture evolve?

A

Redis is a popular in-memory cache. How did it evolve to the architecture it is today?

🔹 2010 - Standalone Redis
When Redis 1.0 was released in 2010, the architecture was quite simple. It was usually used as a cache in front of the business application.

However, Redis stores data in memory. When Redis restarts, all the data is lost and the traffic directly hits the database.

🔹 2013 - Persistence
When Redis 2.8 was released in 2013, it addressed this restriction. Redis introduced RDB in-memory snapshots to persist data. It also added support for AOF (Append-Only File), where each write command is appended to an AOF file.

🔹 2013 - Replication
Redis 2.8 also added replication to increase availability. The primary instance handles real-time read and write requests, while the replicas synchronize the primary's data.

🔹 2013 - Sentinel
Redis 2.8 introduced Sentinel to monitor the Redis instances in real time. Sentinel is a system designed to help manage Redis instances. It performs the following four tasks: monitoring, notification, automatic failover, and acting as a configuration provider.

🔹 2015 - Cluster
In 2015, Redis 3.0 was released. It added Redis Cluster, a distributed database solution that manages data through sharding. The data is divided into 16384 slots, and each node is responsible for a portion of the slots.

🔹 Looking Ahead
Redis is popular because of its high performance and rich data structures that dramatically reduce the complexity of developing a business application.
In 2017, Redis 5.0 was released, adding the stream data type.

In 2020, Redis 6.0 was released, introducing multi-threaded I/O in the network module. The Redis model is divided into the network module and the main processing module. The Redis developers found that the network module tends to become a bottleneck in the system.
Over to you - have you used Redis before? If so, for what use case?

7
Q

How does “scan to pay” work?
How do you pay from your digital wallet, such as PayPal, Venmo, or Paytm, by scanning the QR code?

A

To understand the process involved, we need to divide the “scan to pay” process into two sub-processes:

  1. Merchant generates a QR code and displays it on the screen
  2. Consumer scans the QR code and pays

Here are the steps for generating the QR code:
1. When you want to pay for your shopping, the cashier tallies up all the goods and calculates the total amount due, for example, $123.45. The checkout has an order ID of SN129803. The cashier clicks the “checkout” button.
2. The cashier’s computer sends the order ID and the amount to PSP.
3. The PSP saves this information to the database and generates a QR code URL.
4. PSP’s Payment Gateway service reads the QR code URL.
5. The payment gateway returns the QR code URL to the merchant’s computer.
6. The merchant’s computer sends the QR code URL (or image) to the checkout counter.
7. The checkout counter displays the QR code.
These 7 steps complete in less than a second.

Now it’s the consumer’s turn to pay from their digital wallet by scanning the QR code:
1. The consumer opens their digital wallet app to scan the QR code.
2. After confirming the amount is correct, the consumer clicks the “pay” button.
3. The digital wallet App notifies the PSP that the consumer has paid the given QR code.
4. The PSP payment gateway marks this QR code as paid and returns a success message to the consumer’s digital wallet App.
5. The PSP payment gateway notifies the merchant that the consumer has paid the given QR code.

8
Q

How do Search Engines Work?

A

● Step 1 - Crawling
Web Crawlers scan the internet for web pages. They follow the URL links from one page to
another and store URLs in the URL store. The crawlers discover new content, including web
pages, images, videos, and files.

● Step 2 - Indexing
Once a web page is crawled, the search engine parses the page and indexes the content
found on the page in a database. The content is analyzed and categorized. For example,
keywords, site quality, content freshness, and many other factors are assessed to
understand what the page is about.

● Step 3 - Ranking
Search engines use complex algorithms to determine the order of search results. These
algorithms consider various factors, including keywords, pages’ relevance, content quality,
user engagement, page load speed, and many others. Some search engines also personalize results based on the user’s past search history, location, device, and other personal factors.

● Step 4 - Querying
When a user performs a search, the search engine sifts through its index to provide the most relevant results.
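
A toy sketch of the indexing step (Step 2): building an inverted index that maps each keyword to the pages containing it. Real engines layer ranking signals (freshness, quality, engagement) on top.

```python
pages = {
    "page1": "cheap flights to tokyo",
    "page2": "tokyo travel guide",
}

index = {}
for url, text in pages.items():
    for word in set(text.split()):           # each unique keyword on the page
        index.setdefault(word, set()).add(url)

print(sorted(index["tokyo"]))  # ['page1', 'page2'] -- both pages match
```
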

9
Q

The Payments Ecosystem

A

How do fintech startups find new
opportunities among so many payment companies? What do PayPal, Stripe, and Square do exactly?

Steps 0-1: The cardholder opens an account in the issuing bank and gets the debit/credit card. The merchant registers with ISO (Independent Sales Organization) or MSP (Member Service Provider) for in-store sales. ISO/MSP partners with payment processors to open merchant accounts.

Steps 2-5: The acquiring process.
The payment gateway accepts the purchase transaction and collects payment information. It is then sent to a payment processor, which uses customer information to collect payments. The acquiring processor sends the transaction to the card network. It also owns and operates the merchant’s account during settlement, which doesn’t happen in real-time.

Steps 6-8: The issuing process. The issuing processor talks to the card network on the issuing bank’s behalf. It validates and operates the customer’s account.

I’ve listed some companies in different verticals in the diagram. Notice payment companies usually start from one vertical, but later expand to multiple verticals.

10
Q

Cloud Cost Reduction Techniques

A

Runaway cloud cost is one of the biggest challenges many organizations are battling as they navigate the complexities of cloud computing.
Efficiently managing these costs is crucial for optimizing cloud usage and maintaining financial health. The following techniques can help businesses effectively control and minimize their cloud expenses.

  1. Reduce Usage:
    Fine-tune the volume and scale of resources to ensure efficiency without compromising on the performance of applications (e.g., downsizing instances, minimizing storage space, consolidating services).
  2. Terminate Idle Resources:
    Locate and eliminate resources that are not in active use, such as dormant instances, databases, or storage units.
  3. Right Sizing:
    Adjust instance sizes to adequately meet the demands of your applications, ensuring neither underuse nor overuse.
  4. Shutdown Resources During Off-Peak Times: Set up automatic mechanisms or schedules for turning off non-essential resources when they are not in use, especially during low-activity periods.
  5. Reserve to Reduce Rate:
    Adopt cost-effective pricing models like Reserved Instances or Savings Plans that align with your specific workload needs.

Bonus Tip: Consider using Spot Instances and lower-tier storage options for additional cost savings.

  6. Optimize Data Transfers:
    Utilize methods such as data compression and Content Delivery Networks (CDNs) to cut down on bandwidth expenses, and strategically position resources to reduce data transfer costs, focusing on intra-region transfers.

Over to you: Which technique fits in well with your current cloud infra setup?
11
Q

How do live streaming platforms like YouTube Live, TikTok Live, or Twitch work?

A

Live streaming is challenging because the video content is sent over the internet in near real-time. Video processing is compute-intensive, and sending a large volume of video content over the internet takes time. These factors make live streaming challenging.

The diagram below explains what happens behind the scenes to make this possible.

Step 1: The streamer starts their stream. The source could be any video and audio source wired up to an encoder.

Step 2: To provide the best upload condition for the streamer, most live streaming platforms provide point-of-presence servers worldwide. The streamer connects to a point-of-presence server closest to them.

Step 3: The incoming video stream is transcoded to different resolutions, and divided into smaller video segments a few seconds in length.

Step 4: The video segments are packaged into different live streaming formats that video players can understand. The most common live-streaming format is HLS, or HTTP Live Streaming.

Step 5: The resulting HLS manifest and video chunks from the packaging step are cached by the CDN.

Step 6: Finally, the video starts to arrive at the viewer’s video player.

Steps 7-8: To support replay, videos can optionally be stored in storage such as Amazon S3.

12
Q

9 Best Practices for Building Microservices

Creating a system using microservices is extremely difficult unless you follow some strong principles.

A

1 - Design For Failure
A distributed system with microservices is going to fail. You must design the system to tolerate failure at multiple levels such as infrastructure, database, and individual services. Use circuit breakers, bulkheads, or graceful degradation methods to deal with failures (a minimal circuit breaker sketch follows this list).

2 - Build Small Services
A microservice should not do multiple things at once. A good microservice is designed to do one thing well.

3 - Use Lightweight Protocols for Communication
Communication is the core of a distributed system. Microservices must talk to each other using lightweight protocols. Options include REST, gRPC, or message brokers.

4 - Implement service discovery
To communicate with each other, microservices need to discover each other over the network. Implement service discovery using tools such as Consul, Eureka, or Kubernetes Services.

5 - Data Ownership
In microservices, data should be owned and managed by the individual services. The goal should be to reduce coupling between services so that they can evolve independently.

6 - Use resiliency patterns
Implement specific resiliency patterns to improve the availability of the services.
Examples: retry policies, caching, and rate limiting

7 - Security at all levels
In a microservices-based system, the attack surface is quite large. You must implement security at every level of the service communication path.

8 - Centralized logging
Logs are important to finding issues in a system. With multiple services, they become critical.

9 - Use containerization techniques
To deploy microservices in an isolated manner, use containerization techniques.
Tools like Docker and Kubernetes can help with this as they are meant to simplify the scaling and deployment of a microservice.
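
To make practice 1 concrete, here is a minimal circuit breaker sketch; the failure threshold and cooldown period are illustrative assumptions, not a production recipe.

```python
import time

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None        # half-open: allow one retry
            self.failures = 0
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.time()   # trip the breaker
            raise
        self.failures = 0                # success resets the count
        return result
```
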

Over to you: what other best practice would you recommend?

13
Q

Linux Boot Process Illustrated

A

Step 1 - When we turn on the power, BIOS (Basic Input/Output System) or UEFI (Unified Extensible Firmware Interface) firmware is loaded from non-volatile memory, and executes POST (Power On Self Test).

Step 2 - BIOS/UEFI detects the devices connected to the system, including CPU, RAM, and storage.

Step 3 - Choose a booting device to boot the OS from. This can be the hard drive, the network server, or CD ROM.

Step 4 - BIOS/UEFI runs the boot loader (GRUB), which provides a menu to choose the OS or the kernel to boot.

Step 5 - After the kernel is ready, we now switch to the user space. The kernel starts up systemd as the first user-space process, which manages the processes and services, probes all remaining hardware, mounts filesystems, and runs a desktop environment.

Step 6 - systemd activates the default.target unit by default when the system boots. Other dependent units are executed as well.

Step 7 - The system runs a set of startup scripts and configures the environment.

Step 8 - The users are presented with a login window. The system is now ready.

14
Q

How does Visa make money?

A

Why is the credit card called “the most profitable product in banks”? How does VISA/Mastercard make money?

  1. The cardholder pays a merchant $100 to buy a product.
  2. The merchant benefits from the use of the credit card with higher sales volume, and needs to compensate the issuer and the card network for providing the payment service. The acquiring bank sets a fee with the merchant, called the “merchant discount fee.”

3 - 4. The acquiring bank keeps $0.25 as the acquiring markup, and $1.75 is paid to the issuing bank as the interchange fee. The merchant discount fee should cover the interchange fee. The interchange fee is set by the card network because it is less efficient for each issuing bank to negotiate fees with each merchant.

  5. The card network sets up the network assessments and fees with each bank, which pays the card network for its services every month. For example, VISA charges a 0.11% assessment, plus a $0.0195 usage fee, for every swipe.
  6. The cardholder pays the issuing bank for its services. Why should the issuing bank be compensated?
    ● The issuer pays the merchant even if the cardholder fails to pay the issuer.
    ● The issuer pays the merchant before the cardholder pays the issuer.
    ● The issuer has other operating costs, including managing customer accounts, providing statements, fraud detection, risk management, clearing & settlement, etc.
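
A worked sketch of the fee split on the $100 purchase above; the $2.00 merchant discount fee is an assumption consistent with the $0.25 + $1.75 breakdown in this card.

```python
purchase = 100.00
merchant_discount_fee = 2.00        # assumed; set by the acquiring bank
interchange_fee = 1.75              # paid to the issuing bank
acquiring_markup = merchant_discount_fee - interchange_fee  # $0.25

# Card network assessment on the transaction (VISA example rates above):
assessment = purchase * 0.0011 + 0.0195   # 0.11% + $0.0195 per swipe

print(f"merchant receives: ${purchase - merchant_discount_fee:.2f}")  # $98.00
print(f"acquirer keeps:    ${acquiring_markup:.2f}")                  # $0.25
print(f"network collects:  ${assessment:.4f}")                        # $0.1295
```
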
15
Q

How do we manage configurations in a system?
A comparison between traditional configuration management and IaC
(Infrastructure as Code).

A

● Configuration Management
The practice is designed to manage and provision IT infrastructure through systematic and repeatable processes. This is critical for ensuring that the system performs as intended. Traditional configuration management focuses on maintaining the desired state of the system’s configuration items, such as servers, network devices, and applications, after they have been provisioned.

It usually involves initial manual setup by DevOps. Changes are managed by step-by-step commands.

● What is IaC?
IaC, on the other hand, represents a shift in how infrastructure is provisioned and managed, treating infrastructure setup and changes as software development practices.
IaC automates the provisioning of infrastructure, starting and managing the system through code. It often uses a declarative approach, where the desired state of the infrastructure is described.
Tools like Terraform, AWS CloudFormation, Chef, and Puppet are used to define infrastructure in code files that are source controlled.
IaC represents an evolution towards automation, repeatability, and consistency.

16
Q

What is CSS (Cascading Style Sheets)?

A

Front-end development requires not only presenting content but also making it look good. CSS is a style sheet language used to describe how elements on a web page should be rendered.

▶️ What does CSS do?
CSS separates the content and presentation of a document. In the early days of web development, HTML acted as both content and style.

CSS divides structure (HTML) and style (CSS). This has many benefits, for example, when we
change the color scheme of a web page, all we need to do is to tweak the CSS file.

▶️ How does CSS work?
CSS consists of a selector and a set of properties, which together can be thought of as individual rules.
Selectors are used to locate the HTML elements whose style we want to change, and properties are the specific style descriptions for those elements, such as color, size, position, etc. For example, if we want to make all the text in a paragraph blue, we write CSS code like this: p { color: blue; }
Here “p” is the selector and “color: blue” is the declaration that sets the color of the paragraph text to blue.

▶️ Cascading in CSS
The concept of cascading is crucial to understanding CSS. When multiple style rules conflict, the browser needs to decide which rule to use based on a specific prioritization rule. The one with the highest weight wins. The weight can be determined by a variety of factors, including selector type and the order of the source.

▶️ Powerful Layout Capabilities of CSS
In the past, CSS was only used for simple visual effects such as text colors, font styles, or backgrounds. Today, CSS has evolved into a powerful layout tool capable of handling complex design layouts. The “Flexbox” and “Grid” layout modules are two popular CSS layout modules that make it easy to create responsive designs and precise placement of web elements, so web developers no longer have to rely on complex tables or floating layouts.

▶️ CSS Animation
Animation and interactive elements can greatly enhance the user experience.
CSS3 introduces animation features that allow us to transform and animate elements without using JavaScript. For example, “@keyframes” rule defines animation sequences, and the transition property can be used to set animated transitions from one state to another.

▶️ Responsive Design
CSS allows the layout and style of a website to be adapted to different screen sizes and resolutions, so that we can provide an optimized browsing experience for different devices such as cell phones, tablets and computers.

17
Q

Roadmap for Learning Cyber Security

A

Cybersecurity is crucial for protecting information and systems from theft, damage, and unauthorized access. Whether you’re a beginner or looking to advance your technical skills, there are numerous resources and paths you can take to learn more about cybersecurity. Here are some structured
suggestions to help you get started or deepen your knowledge:
🔹 Security Architecture
🔹 Frameworks & Standards
🔹 Application Security
🔹 Risk Assessment
🔹 Enterprise Risk Management
🔹 Threat Intelligence
🔹 Security Operations

18
Q

How would you design the Stack Overflow website?

A

If your answer is on-premise servers and monolith (on the right), you would likely fail the interview, but that’s how it is built in reality!

What people think it should look like
The interviewer is probably expecting something like the left side:
1. Microservices are used to decompose the system into small components.
2. Each service has its own database. Use cache heavily.
3. Each service is sharded.
4. The services talk to each other asynchronously through message queues.
5. The services are implemented using Event Sourcing with CQRS.
6. Showing off knowledge of distributed systems, such as eventual consistency, the CAP theorem, etc.

What it actually is
Stack Overflow serves all of its traffic with only 9 on-premise web servers, and it's a monolith! It has its own servers and does not run on the cloud.
This is contrary to all our popular beliefs these days.
Over to you: what is good architecture, the one that looks fancy during the interview or the one that works in reality?

19
Q

The one-line change that reduced clone times by a whopping 99%, says Pinterest

A

While it may sound cliché, small changes can definitely create a big impact.
The Engineering Productivity team at Pinterest witnessed this first-hand.
They made a small change in the Jenkins build pipeline of their monorepo codebase called
Pinboard.

And it brought down clone times from 40 minutes to a staggering 30 seconds.
For reference, Pinboard is the oldest and largest monorepo at Pinterest. Some facts about it:
- 350K commits
- 20 GB in size when cloned fully
- 60K git pulls on every business day
Cloning monorepos having a lot of code and history is time consuming. This was exactly what was happening with Pinboard.
The build pipeline (written in Groovy) started with a “Checkout” stage where the repository was cloned for the build and test steps.
The clone options were set to shallow clone, no fetching of tags and only fetching the last 50 commits.

But it missed a vital piece of optimization.
The Checkout step didn’t use the Git refspec option.

This meant that Git was effectively fetching all refspecs for every build. For the Pinboard monorepo, it meant fetching more than 2500 branches.

𝐒𝐨 - 𝐰𝐡𝐚𝐭 𝐰𝐚𝐬 𝐭𝐡𝐞 𝐟𝐢𝐱?
The team simply added the refspec option and specified which ref they cared about. It was the “master” branch in this case.
This single change allowed Git clone to deal with only one branch and significantly reduced the overall build time of the monorepo.
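
The actual fix was a one-line refspec option in the Jenkins Groovy checkout step; the sketch below shows the equivalent idea invoked from Python, fetching only the master ref instead of every branch. It assumes it runs inside a Git repository with a remote named origin.

```python
import subprocess

# Fetch only the master branch instead of all ~2500 branch refs.
# The explicit refspec is what keeps Git from fetching everything.
subprocess.run(
    ["git", "fetch", "origin",
     "+refs/heads/master:refs/remotes/origin/master"],
    check=True,
)
```
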

20
Q

How does JavaScript work?

A

The cheat sheet below shows the most important characteristics of JavaScript.

🔹 Interpreted Language
JavaScript code is executed by the browser or JavaScript engine rather than being compiled into machine language beforehand. This makes it highly portable across different platforms. Modern engines such as V8 utilize Just-In-Time (JIT) technology to compile code into directly executable machine code.

🔹 Function is First-Class Citizen
In JavaScript, functions are treated as first-class citizens, meaning they can be stored in variables, passed as arguments to other functions, and returned from functions.

🔹 Dynamic Typing
JavaScript is a loosely typed or dynamic language, meaning we don’t have to declare a variable’s type ahead of time, and the type can change at runtime.

🔹 Asynchronous Programming
JavaScript supports asynchronous programming, allowing operations like reading files, making HTTP requests, or querying databases to run in the background and trigger callbacks or promises when complete. This is particularly useful in web development for improving performance and user experience.

🔹 Prototype-Based OOP
Unlike class-based object-oriented languages, JavaScript uses prototypes for inheritance. This means that objects can inherit properties and methods from other objects.

🔹 Automatic Garbage Collection
Garbage collection in JavaScript is a form of automatic memory management. The primary goal of garbage collection is to reclaim memory occupied by objects that are no longer in use by the program, which helps prevent memory leaks and optimizes the performance of the application.

🔹 Compared with Other Languages
JavaScript is special compared to programming languages like Python or Java because of its position as a major language for web development. While Python is known to provide good code readability and versatility, and Java is known for its structure and robustness, JavaScript is an interpreted language that runs directly on the browser without compilation, emphasizing flexibility and dynamism.

🔹 Relationship with Typescript
TypeScript is a superset of JavaScript, which means that it extends JavaScript by adding features to the language, most notably type annotations. This relationship allows any valid JavaScript code to also be considered valid TypeScript code.

🔹 Popular JavaScript Frameworks
React is known for its flexibility and large number of community-driven plugins, while Vue is clean and intuitive with highly integrated and responsive features. Angular, on the other hand, offers a strict set of development specifications for enterprise-level JS development.

21
Q

How does gRPC work?

A

RPC (Remote Procedure Call) is called “remote” because it enables communication between remote services when services are deployed to different servers under a microservice architecture. From the user's point of view, it acts like a local function call.
The diagram below illustrates the overall data flow for gRPC.

Step 1: A REST call is made from the client. The request body is usually in JSON format.

Steps 2 - 4: The order service (gRPC client) receives the REST call, transforms it, and makes an RPC call to the payment service. The client stub encodes the request into a binary format and sends it to the low-level transport layer.

Step 5: gRPC sends the packets over the network via HTTP/2. Because of binary encoding and network optimizations, gRPC is said to be 5X faster than JSON.

Steps 6 - 8: The payment service (gRPC server) receives the packets from the network, decodes them, and invokes the server application.

Steps 9 - 11: The result is returned from the server application, and gets encoded and sent to the transport layer.

Steps 12 - 14: The order service receives the packets, decodes them, and sends the result to the client application.

22
Q

How Netflix Really Uses Java?

A

Netflix is predominantly a Java shop.
Every backend application (including internal apps, streaming, and movie production apps) at Netflix is a Java application.
However, the Java stack is not static and has gone through multiple iterations over the years.

Here are the details of those iterations:

1 - API Gateway
Netflix follows a microservices architecture. Every piece of functionality and data is owned by a microservice built using Java (initially version 8).
This means that rendering one screen (such as the List of Lists of Movies, or LOLOMO) involved fetching data from tens of microservices. But making all these calls from the client created a performance problem.
Netflix initially used the API Gateway pattern, implemented with Zuul, to handle the orchestration.

2 - BFFs with Groovy & RxJava
Using a single gateway for multiple clients was a problem for Netflix because each client (such as TV, mobile apps, or web browser) had subtle differences. To handle this, Netflix used the Backend-for-Frontend (BFF) pattern. Zuul was moved to the role of a proxy. In this pattern, every frontend or UI gets its own mini backend that performs the request fanout and orchestration for multiple services. The BFFs were built using Groovy scripts and the service fanout was done using RxJava for thread management.

3 - GraphQL Federation
The Groovy and RxJava approach required more work from the UI developers in creating the Groovy scripts. Also, reactive programming is generally hard. Recently, Netflix moved to GraphQL Federation. With GraphQL, a client can specify exactly what set of fields it needs, thereby solving the problem of overfetching and underfetching with REST APIs.

The GraphQL Federation takes care of calling the necessary microservices to fetch the data.
These microservices are called Domain Graph Service (DGS) and are built using Java 17, Spring Boot 3, and Spring Boot Netflix OSS packages. The move from Java 8 to Java 17 resulted in 20% CPU gains.

More recently, Netflix has started to migrate to Java 21 to take advantage of features like virtual threads.

23
Q

OSI Model
How is data sent over the network? Why do we need so many layers in the OSI model?

A

The diagram below shows how data is encapsulated and de-encapsulated when transmitting over the network.

Step 1: When Device A sends data to Device B over the network via the HTTP protocol, an HTTP header is first added at the application layer.

Step 2: Then a TCP or a UDP header is added to the data. It is encapsulated into TCP segments at the transport layer. The header contains the source port, destination port, and sequence number.

Step 3: The segments are then encapsulated with an IP header at the network layer. The IP header contains the source/destination IP addresses.

Step 4: A MAC header is added to the IP datagram at the data link layer, with source/destination MAC addresses.

Step 5: The encapsulated frames are passed to the physical layer and sent over the network as binary bits.

Steps 6-10: When Device B receives the bits from the network, it performs the de-encapsulation process, which is a reverse processing of the encapsulation process. The headers are removed layer by layer, and eventually, Device B can read the data.
We need layers in the network model because each layer focuses on its own responsibilities. Each layer can rely on the headers for processing instructions and does not need to know the meaning of the data from the last layer.
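
A toy sketch of the journey above, with each layer's header reduced to a string tag for illustration:

```python
def encapsulate(data):
    packet = "HTTP|" + data      # application layer header
    packet = "TCP|" + packet     # transport layer: ports, sequence number
    packet = "IP|" + packet      # network layer: source/destination IPs
    packet = "MAC|" + packet     # data link layer: MAC addresses
    return packet

def de_encapsulate(frame):
    for layer in ("MAC", "IP", "TCP", "HTTP"):
        header, frame = frame.split("|", 1)
        assert header == layer   # each layer strips only its own header
    return frame

frame = encapsulate("GET /index.html")
print(frame)                     # MAC|IP|TCP|HTTP|GET /index.html
print(de_encapsulate(frame))     # GET /index.html
```
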

Over to you: Do you know which layer is responsible for resending lost data?

24
Q

8 Key Data Structures That Power Modern Databases

A

🔹Skiplist: a common in-memory index type. Used in Redis

🔹Hash index: a very common implementation of the “Map” data structure (or “Collection”)

🔹SSTable: immutable on-disk “Map” implementation

🔹LSM tree: Skiplist + SSTable. High write throughput

🔹B-tree: disk-based solution. Consistent read/write performance

🔹Inverted index: used for document indexing. Used in Lucene

🔹Suffix tree: for string pattern search

🔹R-tree: multi-dimension search, such as finding the nearest neighbor

25
Q

What is DevSecOps?

A

DevSecOps emerged as a natural evolution of DevOps practices, with a focus on integrating security into the software development and deployment process. The term "DevSecOps" represents the convergence of Development (Dev), Security (Sec), and Operations (Ops) practices, emphasizing the importance of security throughout the software development lifecycle. The diagram below shows the important concepts in DevSecOps:
1. Automated Security Checks
2. Continuous Monitoring
3. CI/CD Automation
4. Infrastructure as Code (IaC)
5. Container Security
6. Secret Management
7. Threat Modeling
8. Quality Assurance (QA) Integration
9. Collaboration and Communication
10. Vulnerability Management
26
Q

Change Data Capture: key to leveraging real-time data

A

90% of the world's data was created in the last two years, and this growth will only get faster. However, the biggest challenge is to leverage this data in real-time. Constant data changes make databases, data lakes, and data warehouses fall out of sync.

CDC, or Change Data Capture, can help you overcome this challenge. CDC identifies and captures changes made to the data in a database, allowing you to replicate and sync data across multiple systems.

So, how does Change Data Capture work? Here's a step-by-step breakdown (sketched in code below):
1 - Data Modification: A change is made to the data in the source database. It could be an insert, update, or delete operation on a table.
2 - Change Capture: A CDC tool monitors the database transaction logs to capture the modifications. It uses the source connector to connect to the database and read the logs.
3 - Change Processing: The captured changes are processed and transformed into a format suitable for the downstream systems.
4 - Change Propagation: The processed changes are published to a message queue and propagated to the target systems, such as data warehouses, analytics platforms, distributed caches like Redis, and so on.
5 - Real-Time Integration: The CDC tool uses its sink connector to consume the log and update the target systems. The changes are received in real time, allowing for conflict-free data analysis and decision-making.

Users only need to take care of step 1, while all other steps are transparent. A popular CDC solution uses Debezium with Kafka Connect to stream data changes from the source to target systems, using Kafka as the broker. Debezium has connectors for most databases, such as MySQL, PostgreSQL, Oracle, etc.
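
A hedged, self-contained sketch of steps 2-5, with the transaction log and the sinks mocked as plain lists; a real pipeline would use Debezium and Kafka Connect instead.

```python
# Mocked transaction log from the source database (step 1 already happened).
transaction_log = [
    {"op": "insert", "table": "orders", "row": {"id": 1, "total": 99}},
    {"op": "update", "table": "orders", "row": {"id": 1, "total": 89}},
]

def capture_and_propagate(log, sinks):
    for event in log:                                   # step 2: capture
        message = {"key": event["row"]["id"], **event}  # step 3: transform
        for sink in sinks:                              # step 4: propagate
            sink.append(message)                        # step 5: sinks apply it

warehouse, cache = [], []
capture_and_propagate(transaction_log, [warehouse, cache])
print(len(warehouse), len(cache))  # 2 2 -- both targets stay in sync
```
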
27
Q

Top 6 Elasticsearch Use Cases

A

Elasticsearch is widely used for its powerful and versatile search capabilities. The diagram below shows the top 6 use cases:

🔹 Full-Text Search
Elasticsearch excels in full-text search scenarios due to its robust, scalable, and fast search capabilities. It allows users to perform complex queries with near real-time responses.

🔹 Real-Time Analytics
Elasticsearch's ability to perform analytics in real-time makes it suitable for dashboards that track live data, such as user activity, transactions, or sensor outputs.

🔹 Machine Learning
With the addition of the machine learning feature in X-Pack, Elasticsearch can automatically detect anomalies, patterns, and trends in the data.

🔹 Geo-Data Applications
Elasticsearch supports geo-data through geospatial indexing and searching capabilities. This is useful for applications that need to manage and visualize geographical information, such as mapping and location-based services.

🔹 Log and Event Data Analysis
Organizations use Elasticsearch to aggregate, monitor, and analyze logs and event data from various sources. It's a key component of the ELK stack (Elasticsearch, Logstash, Kibana), which is popular for managing system and application logs to identify issues and monitor system health.

🔹 Security Information and Event Management (SIEM)
Elasticsearch can be used as a tool for SIEM, helping organizations to analyze security events in real time.
28
Q

How do computer programs run?

A

🔹 User interaction and command initiation
By double-clicking a program, a user is instructing the operating system to launch an application via the graphical user interface.

🔹 Program preloading
Once the execution request has been initiated, the operating system first retrieves the program's executable file. The operating system locates this file through the file system and loads it into memory in preparation for execution.

🔹 Dependency resolution and loading
Most modern applications rely on a number of shared libraries, such as dynamic link libraries (DLLs).

🔹 Allocating memory space
The operating system is responsible for allocating space in memory.

🔹 Initializing the runtime environment
After allocating memory, the operating system and execution environment (e.g., Java's JVM or the .NET Framework) initialize various resources needed to run the program.

🔹 System calls and resource management
The entry point of a program (usually a function named `main`) is called to begin execution of the code written by the programmer.

🔹 Von Neumann architecture
In the Von Neumann architecture, the CPU executes instructions stored in memory.

🔹 Program termination
Eventually, when the program has completed its task, or the user actively terminates the application, the program begins a cleanup phase. This includes closing open file descriptors, freeing up network resources, and returning memory to the system.
29
Q

Netflix's Tech Stack

A

This post is based on research from many Netflix engineering blogs and open-source projects. If you come across any inaccuracies, please feel free to inform us.

Mobile and web: Netflix has adopted Swift and Kotlin to build native mobile apps. For its web application, it uses React.

Frontend/server communication: GraphQL.

Backend services: Netflix relies on ZUUL, Eureka, the Spring Boot framework, and other technologies.

Databases: Netflix utilizes EVCache, Cassandra, CockroachDB, and other databases.

Messaging/streaming: Netflix employs Apache Kafka and Flink for messaging and streaming purposes.

Video storage: Netflix uses S3 and Open Connect for video storage.

Data processing: Netflix utilizes Flink and Spark for data processing, which is then visualized using Tableau. Redshift is used for processing structured data warehouse information.

CI/CD: Netflix employs various tools such as JIRA, Confluence, PagerDuty, Jenkins, Gradle, Chaos Monkey, Spinnaker, Atlas, and more for CI/CD processes.
30
Q

Top 6 Cloud Messaging Patterns

A

How do services communicate with each other? The diagram below shows 6 cloud messaging patterns.

🔹 Asynchronous Request-Reply
This pattern aims at providing determinism for long-running backend tasks. It decouples backend processing from frontend clients. In the diagram below, the client makes a synchronous call to the API, triggering a long-running operation on the backend. The API returns an HTTP 202 (Accepted) status code, acknowledging that the request has been received for processing.

🔹 Publisher-Subscriber
This pattern targets decoupling senders from consumers, avoiding blocking the sender while it waits for a response (a minimal sketch follows this card).

🔹 Claim Check
This pattern solves the transmission of large messages. It stores the whole message payload in a database and transmits only a reference to the message, which is used later to retrieve the payload from the database.

🔹 Priority Queue
This pattern prioritizes requests sent to services so that requests with a higher priority are received and processed more quickly than those with a lower priority.

🔹 Saga
Saga is used to manage data consistency across multiple services in distributed systems, especially in microservices architectures where each service manages its own database. The saga pattern addresses the challenge of maintaining data consistency without relying on distributed transactions, which are difficult to scale and can negatively impact system performance.

🔹 Competing Consumers
This pattern enables multiple concurrent consumers to process messages received on the same messaging channel. There is no need to configure complex coordination between the consumers. However, this pattern cannot guarantee message ordering.
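
A minimal in-process sketch of the publisher-subscriber pattern referenced above; a real system would put a broker such as Kafka or a cloud message bus between the two sides.

```python
from collections import defaultdict

class Broker:
    """Decouples publishers from subscribers: neither knows the other."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self.subscribers[topic]:
            handler(message)          # publisher never blocks on replies

broker = Broker()
broker.subscribe("orders", lambda m: print("billing saw:", m))
broker.subscribe("orders", lambda m: print("shipping saw:", m))
broker.publish("orders", {"id": 42})
```
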
31
Q

Reddit's core architecture that helps it serve over 1 billion users every month

A

This information is based on research from many Reddit engineering blogs. But since architecture is ever-evolving, things might have changed in some aspects. The main points of Reddit's architecture are as follows:

1 - Reddit uses a Content Delivery Network (CDN) from Fastly as a front for the application.
2 - Reddit started using jQuery in early 2009. Later on, they started using TypeScript and have now moved to modern Node.js frameworks. Over the years, Reddit has also built mobile apps for Android and iOS.
3 - Within the application stack, the load balancer sits in front and routes incoming requests to the appropriate services.
4 - Reddit started as a Python-based monolithic application but has since started moving to microservices built using Go.
5 - Reddit heavily uses GraphQL for its API layer. In early 2021, they started moving to GraphQL Federation, which is a way to combine multiple smaller GraphQL APIs known as Domain Graph Services (DGS). In 2022, the GraphQL team at Reddit added several new Go subgraphs for core Reddit entities, thereby splitting the GraphQL monolith.
6 - From a data storage point of view, Reddit relies on Postgres for its core data model. To reduce the load on the database, they use memcached in front of Postgres. Also, they use Cassandra quite heavily for new features, mainly because of its resiliency and availability properties.
7 - To support data replication and maintain cache consistency, Reddit uses Debezium to run a Change Data Capture process.
8 - Expensive operations such as a user voting or submitting a link are deferred to an async job queue via RabbitMQ and processed by job workers. For content safety checks and moderation, they use Kafka to transfer data in real-time and run rules over it.
9 - Reddit uses AWS and Kubernetes as the hosting platform for its various apps and internal services.
10 - For deployment and infrastructure, they use Spinnaker, Drone CI, and Terraform.

Over to you: what other aspects do you know about Reddit's architecture?
32
Q

Top 9 Architectural Patterns for Data and Communication Flow

A

🔹 Peer-to-Peer
The Peer-to-Peer pattern involves direct communication between two components without the need for a central coordinator.

🔹 API Gateway
An API Gateway acts as a single entry point for all client requests to the backend services of an application.

🔹 Pub-Sub
The Pub-Sub pattern decouples the producers of messages (publishers) from the consumers of messages (subscribers) through a message broker.

🔹 Request-Response
This is one of the most fundamental integration patterns, where a client sends a request to a server and waits for a response.

🔹 Event Sourcing
Event Sourcing involves storing the state changes of an application as a sequence of events.

🔹 ETL
ETL is a data integration pattern used to gather data from multiple sources, transform it into a structured format, and load it into a destination database.

🔹 Batching
Batching involves accumulating data over a period or until a certain threshold is met before processing it as a single group.

🔹 Streaming Processing
Streaming Processing allows for the continuous ingestion, processing, and analysis of data streams in real-time.

🔹 Orchestration
Orchestration involves a central coordinator (an orchestrator) managing the interactions between distributed components or services to achieve a workflow or business process.
33
Q

Core Principles of Solution Architecture Design

A

Great architecture isn't just about solving today's problems; it's about preparing for growth and change. Here are 7 foundational principles, with actionable best practices, to guide your designs:

1) Scalable - Design systems to handle increased load via horizontal scaling, auto-scaling, and distributed state management.
2) Highly available & resilient - Ensure uptime and fast recovery using failover strategies, redundancy, and data synchronization.
3) Performant - Optimize for low latency and high throughput using async processing, caching, and monitoring worst-case latencies (e.g., p99).
4) Secure - Bake in security from the start: use encryption, RBAC, OAuth2/JWT, and consider Zero Trust Architecture.
5) Loosely coupled - Enable flexibility and fault tolerance with modular design and event-driven messaging systems like Kafka or RabbitMQ.
6) Extensible - Support future growth by following the Open-Closed Principle and building backward-compatible APIs.
7) Reusable - Boost development speed with composable components, shared libraries, and domain-driven design (DDD).

Applying these principles leads to systems that are scalable, maintainable, and ready for whatever comes next.
34
Q

Stateful vs Stateless Design

A

Stateless design is a powerful model that has led to the development of simple yet highly scalable and efficient applications. "State" refers to stored information that systems use to process requests. This information can change over time as users interact with the application or as system events occur.

What are Stateful Applications?
With stateful applications, client data such as user ID, session information, configurations, and preferences are stored to help process requests for a given user. Depending on the functionality and requirements of the application, additional data may be saved, such as shopping cart information for an online store or transaction history for a FinTech service. A stateful design allows applications to provide a personalized experience to their users while removing the need to share data across multiple requests. For this reason, it is a popular approach for applications with user preferences, such as streaming services and online games.

Use Cases for Stateless Design
Stateless design has risen in popularity due to its alignment with trends in modern computing such as serverless architecture and microservices. One of the key principles behind microservices is that each service is stateless. This allows microservices to scale independently and ensures resource consumption stays efficient. Serverless computing follows the same concept: each function is invoked independently. Even applications that require session management can benefit from implementing a stateless design in components of their system. For example, most RESTful APIs are stateless, where each API call contains all necessary information. CDNs also follow a stateless design so that every request can be fulfilled by any server in the network, without needing to sync session data between all servers or query a single session management store.

Disadvantages of Stateless Design
The size of requests can be considerably larger in stateless design. Moreover, sending data across multiple requests can introduce significant inefficiencies that are far greater than the alternative of managing and querying this data from a central storage system. It is important to note that stateless design should only be implemented for use cases that are truly stateless. Although stateful design has its share of disadvantages, workarounds can add more complexity and fragility.

Final Thoughts
Most applications pick a hybrid approach between stateful and stateless design, depending on the needs and constraints of each component. The key to a well-designed system is balance. It should be scalable, simple, and fast without sacrificing functionality.
35
Q

How AI Speeds Up the Software Development Cycle

A

AI isn't replacing engineers; instead, it's making us faster. AI coding tools are making developers faster, especially during the coding stage. But most tools stop there. GitLab Duo goes further, bringing AI to every stage of the SDLC, from design to deployment:

System Design & Architecture: AI suggests improvements and flags vulnerabilities early.
Code Development: Real-time autocomplete, refactoring tips, and code explanations.
Testing & QA: Auto-generates tests and identifies security risks.
Deployment & Release: Enhances CI/CD with smart summaries and troubleshooting.
Security & Compliance: A self-hosted option ensures enterprise-grade control over data.
36
Q

Engineering Profit: A Deep Dive into Dynamic Pricing Systems

A

What do Uber, Amazon, airlines, and Facebook all have in common? Dynamic pricing. It's a major function of their business models. And it's not just them either. Dynamic pricing is a key strategy for a lot of companies, maybe even the company you're currently working for. Behind dynamic pricing sits a system that leverages software engineering, data science, and business strategy to help companies achieve business outcomes.

Dynamic Pricing in Practice
Dynamic pricing is a key strategy for businesses to increase profitability. It enables companies to maximize revenue by increasing prices during periods of high demand. In times of low demand, dynamic pricing allows companies to lower prices to stimulate demand and maintain a consistent revenue stream. This ensures that they generate sales even during periods of lower demand. It also helps businesses conduct competitive pricing by allowing them to make adjustments based on competitors' strategies. And for businesses with physical products, dynamic pricing is essential to inventory management: it helps optimize stock levels, consequently minimizing potential losses. Depending on the business's needs and system, these price changes are often made in real-time or near real-time.

Here is how Uber, Amazon, airlines, and Facebook use dynamic pricing:
- Uber adjusts fares based on rider demand and driver availability, leading to higher prices during peak times.
- Amazon changes product prices frequently, considering competition, demand, and availability.
- Airlines vary ticket prices based on booking time, demand, and seat availability.
- Facebook uses dynamic pricing for its advertising slots, with costs fluctuating based on demand, ad placement, and competition.
By understanding the business context, you're better equipped as an engineer to strategize and implement the technical solution, helping achieve business outcomes.

Understanding the Architecture of Dynamic Pricing Systems
Dynamic pricing carefully balances market demand, competitor pricing, and inventory levels while adjusting prices in real or near-real time. Beyond evaluating these factors, engineers must also build systems that can process large datasets quickly and automate scalable pricing decisions. Dynamic pricing generally follows the ELT process (extract, load, transform). It begins with data collection and analysis, capturing real-time sales, customer behavior, inventory status, and competitor prices. Data pipelines must be created to handle this influx from diverse sources, ensuring both the precision and speed necessary for the pricing algorithms. After collection, efficient data storage is necessary for quick querying and analysis. To improve real-time analytics performance, data warehouses should be optimized for analytical queries and scalable to accommodate data growth, with caching mechanisms and data indexing in place.

Moving from storage to decision-making, well-designed algorithms are the backbone of any efficient dynamic pricing strategy. Many approaches can be taken; here are some of the most notable (a rule-based sketch follows this card):
- Rule-based systems: This is the simplest approach, where price adjustments are based on predefined criteria.
- Time-series forecasting: This approach analyzes historical data to set future pricing. ARIMA (AutoRegressive Integrated Moving Average) and Prophet (a forecasting tool developed by Facebook) are commonly used for time-series forecasting. The challenge here is to accurately model and forecast pricing trends, taking into account seasonal variations and market shifts.
- Machine learning models: Regression models, decision trees, and neural networks are often used to determine pricing based on historical information.
- Multi-armed bandit algorithms: When you have multiple pricing strategies available, multi-armed bandit algorithms can be used to determine which one will provide the most revenue.

Challenges and Considerations
Crafting dynamic pricing systems comes with its own set of challenges. Key among these is the need for scalability, ensuring the system can manage large data volumes, particularly during peak times. Preserving data privacy and security is equally important, requiring systems to comply with legal and ethical standards. Additionally, the design must consider how swift changes in pricing could influence customer perceptions and the overall brand reputation. Each of these factors plays a crucial role in the successful implementation and operation of a dynamic pricing strategy.

Wrapping Up
Twenty years ago, dynamic pricing systems were rarely seen. Nowadays, a lot of companies have them. The growing adoption and continuous development of dynamic pricing systems highlight the growing role of engineering in modern business.
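
A minimal sketch of the rule-based approach, the simplest of the algorithms listed above; all thresholds and multipliers are illustrative assumptions.

```python
def dynamic_price(base_price, demand_ratio, stock_ratio):
    """Adjust a base price from demand and inventory signals."""
    price = base_price
    if demand_ratio > 1.5:        # demand well above supply
        price *= 1.25
    elif demand_ratio < 0.5:      # stimulate demand when it is low
        price *= 0.9
    if stock_ratio < 0.2:         # scarce inventory
        price *= 1.1
    return round(price, 2)

print(dynamic_price(10.00, demand_ratio=1.8, stock_ratio=0.15))  # 13.75
```
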
37
Q

Top 8 Cache Eviction Strategies

A

🔹 LRU (Least Recently Used)
The LRU eviction strategy removes the least recently accessed items first. This approach is based on the principle that items accessed recently are more likely to be accessed again in the near future (a minimal LRU sketch follows this card).

🔹 MRU (Most Recently Used)
Contrary to LRU, the MRU algorithm removes the most recently used items first. This strategy can be useful in scenarios where the most recently accessed items are less likely to be accessed again soon.

🔹 SLRU (Segmented LRU)
SLRU divides the cache into two segments: a probationary segment and a protected segment. New items are initially placed into the probationary segment. If an item in the probationary segment is accessed again, it is promoted to the protected segment.

🔹 LFU (Least Frequently Used)
The LFU algorithm evicts the items with the lowest access frequency.

🔹 FIFO (First In First Out)
FIFO is one of the simplest caching strategies, where the cache behaves in a queue-like manner, evicting the oldest items first, regardless of their access patterns or frequency.

🔹 TTL (Time-to-Live)
While not strictly an eviction algorithm, TTL is a strategy where each cache item is given a specific lifespan.

🔹 Two-Tiered Caching
In the Two-Tiered Caching strategy, we use an in-memory cache for the first layer and a distributed cache for the second layer.

🔹 RR (Random Replacement)
The Random Replacement algorithm randomly selects a cache item and evicts it to make space for new items. This method is simple to implement and does not require tracking access patterns or frequencies.
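
A minimal LRU sketch built on Python's OrderedDict: reads move a key to the most-recently-used end, and inserts evict from the other end when the cache is full.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict least recently used
```
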
41
How do you pay from your digital wallet by scanning the QR code?
To understand the process involved, we can divide "scan to pay" into two sub-processes:
1. The merchant generates a QR code and displays it on the screen.
2. The consumer scans the QR code and pays.

Here are the steps for generating the QR code:
1. When you want to pay for your shopping, the cashier tallies up all the goods and calculates the total amount due, for example, $123.45. The checkout has an order ID of SN129803. The cashier clicks the "checkout" button.
2. The cashier's computer sends the order ID and the amount to the PSP (payment service provider).
3. The PSP saves this information to the database and generates a QR code URL.
4. The PSP's payment gateway service reads the QR code URL.
5. The payment gateway returns the QR code URL to the merchant's computer.
6. The merchant's computer sends the QR code URL (or image) to the checkout counter.
7. The checkout counter displays the QR code.

These 7 steps complete in less than a second. Now it's the consumer's turn to pay from their digital wallet by scanning the QR code:
1. The consumer opens their digital wallet app to scan the QR code.
2. After confirming the amount is correct, the consumer clicks the "pay" button.
3. The digital wallet app notifies the PSP that the consumer has paid the given QR code.
4. The PSP's payment gateway marks this QR code as paid and returns a success message to the consumer's digital wallet app.
5. The PSP's payment gateway notifies the merchant that the consumer has paid the given QR code.
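To make the PSP side of this flow concrete, here is a minimal sketch. All names here (create_qr_payment, mark_paid, the in-memory PAYMENTS store, the example URL) are hypothetical; a real PSP would persist state durably, sign the QR payload, enforce idempotency and expiry, and notify the merchant via webhook.

```python
import uuid

PAYMENTS = {}  # qr_id -> payment record (in-memory stand-in for the database)

def create_qr_payment(order_id: str, amount_cents: int) -> str:
    """Called during checkout (merchant steps 2-3); returns the QR code URL."""
    qr_id = uuid.uuid4().hex
    PAYMENTS[qr_id] = {"order_id": order_id, "amount": amount_cents, "paid": False}
    return f"https://psp.example.com/qr/{qr_id}"  # hypothetical URL scheme

def mark_paid(qr_id: str) -> bool:
    """Called when the wallet app reports payment (consumer steps 3-4)."""
    record = PAYMENTS.get(qr_id)
    if record is None or record["paid"]:
        return False  # unknown QR code, or already paid
    record["paid"] = True
    # A real system would now notify the merchant (consumer step 5).
    return True

url = create_qr_payment("SN129803", 12345)  # $123.45 in cents
print(mark_paid(url.rsplit("/", 1)[-1]))    # True on first payment
```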
46
Serverless Architecture Demystified: Strategies for Success and Pitfalls to Avoid
The Essence of Serverless Computing

Serverless computing abstracts server management away from the development team. Instead, it relies on Functions-as-a-Service (FaaS) to run code in response to events. With this setup, cloud providers allocate resources dynamically and charge only for the compute time actually used rather than for reserved capacity. Serverless architectures can support a wide range of applications, from simple CRUD operations to complex, event-driven data processing workflows. They foster a focus on code and functionality, streamlining the deployment of applications that automatically adapt to fluctuating workloads.

Key Practices

To take full advantage of serverless architectures, follow these best practices:

- Design for failure: Ensuring your application can handle failures gracefully is essential in a serverless setup. Strategies like retry mechanisms and circuit breakers help maintain reliability and availability (a retry sketch follows this list).
- Optimize for performance: Serverless performance optimization has two goals: reducing cold start latency and maximizing resource utilization. Lightweight functions, careful programming language selection, and aligning memory and compute allocations with function requirements all help reduce startup times and costs.
- Security considerations: A proactive approach to security is a must. To protect your serverless applications, apply the principle of least privilege, secure your API gateways, and encrypt data.
- Cost management: Although serverless is cost-effective, improper utilization can drive costs up. Monitor usage patterns and adjust resource allocations to keep expenses under control.
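As an illustration of the design-for-failure practice, here is a minimal retry-with-backoff decorator in Python. The attempt count and delays are illustrative defaults, and a real function would catch specific exception types rather than bare Exception.

```python
import random
import time
from functools import wraps

def retry(max_attempts: int = 3, base_delay: float = 0.2):
    """Retry a flaky call with exponential backoff and jitter."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts:
                        raise  # out of retries; let the failure surface
                    # Exponential backoff plus jitter avoids synchronized retries.
                    time.sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
        return wrapper
    return decorator

@retry(max_attempts=3)
def call_downstream(payload):
    ...  # placeholder for a call to a flaky dependency (HTTP, queue, DB)
```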
Navigating Pitfalls

While the practices above yield results, there are common pitfalls to be mindful of:

- Ignoring cold start latency: Cold starts can significantly degrade the user experience. Reduce them by using warm-up techniques and optimizing your code.
- Overlooking security in a shared environment: Don't let the convenience of serverless computing breed complacency. Inadequate function permissions and neglected data encryption are common oversights; ensure robust security measures are in place.
- Complexity in managing multiple services: The granular nature of serverless can produce architectural complexity, particularly when integrating many services and functions. Adopting Infrastructure as Code (IaC) and serverless frameworks streamlines management.
- Limited control and vendor lock-in: Dependence on a single cloud provider can limit your control and flexibility. Evaluate serverless solutions for flexibility and portability to ensure they align with long-term architectural goals.

When and Where Going Serverless Makes Sense

Serverless excels with event-driven applications due to its reactive execution model. For microservices, it enables independent scaling and deployment. It also works well for projects with fluctuating traffic thanks to automatic, efficient scaling, and it is ideal for rapid development, allowing teams to focus on coding over infrastructure management. The pay-as-you-go model can also suit cost-sensitive projects.

However, serverless generally doesn't fit long-running tasks because of execution time limits. Latency-sensitive applications can suffer from cold start delays, and use cases that need precise environmental control may not be a good fit, since serverless offers limited infrastructure customization. Assess your project's specific needs (performance, cost, scalability, and so on) to determine whether serverless aligns with the project's goals.

Wrapping Up

Serverless architectures have simplified server management, enabling developers to focus more on code and functionality than on managing infrastructure. Despite its benefits, navigating serverless computing requires an understanding of its complexities and limitations. By adhering to best practices and staying mindful of potential pitfalls, developers can leverage serverless technologies to build scalable, cost-efficient, and resilient applications.
48
What do Amazon, Netflix, and Uber have in common?
They are extremely good at scaling their systems whenever needed. Here are 8 must-know strategies to scale your system:

1 - Stateless Services: Design stateless services; because they don't rely on server-specific data, they are easier to scale.
2 - Horizontal Scaling: Add more servers so that the workload can be shared.
3 - Load Balancing: Use a load balancer to distribute incoming requests evenly across multiple servers.
4 - Auto Scaling: Implement auto-scaling policies that adjust resources based on real-time traffic.
5 - Caching: Use caching to reduce the load on the database and handle repetitive requests at scale.
6 - Database Replication: Replicate data across multiple nodes to scale read operations while improving redundancy.
7 - Database Sharding: Distribute data across multiple instances to scale writes as well as reads (a hash-based sharding sketch follows this list).
8 - Async Processing: Move time-consuming, resource-intensive tasks to background workers so the system can scale out to handle new requests.
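As referenced in strategy 7, here is a minimal hash-based shard router in Python. The shard names are hypothetical, and production systems often prefer consistent hashing so that adding or removing a shard relocates fewer keys.

```python
import hashlib

# Hypothetical shard names; in practice these would be connection targets.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]

def shard_for(key: str) -> str:
    """Map a key to a shard by hashing (MD5 used for distribution, not security)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for("user:42"))  # always routes the same key to the same shard
```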
57
What Is gRPC and When Should You Use It?
gRPC is a powerful remote procedure call (RPC) framework developed by Google, enabling efficient and fast communication between services. It is built on HTTP/2 and Protocol Buffers, and this is where many of gRPC's benefits come from:

- Using Protocol Buffers (Protobufs) as its interface definition language (IDL) helps alleviate tech-stack lock-in: each service can be written in any popular programming language, and the IDL works across them all.
- The compact binary format of Protobufs provides faster serialization/deserialization and a smaller payload than JSON. Less data sent means better performance.
- Since Protobufs are strongly typed, they provide type safety, which can eliminate many potential bugs.
- HTTP/2 supports bidirectional streaming and reduced latency for real-time data transmission.

The combination of HTTP/2 and Protobufs provides high throughput with minimal latency; it is often faster than the traditional JSON-over-HTTP approach. The easy implementation and the benefits above have made gRPC very popular for microservices communication.
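To illustrate the size advantage of binary encodings, here is a small Python comparison of the same record serialized as JSON versus packed into a fixed binary layout with struct. Protobuf's wire format differs (field tags, varints), so this is an analogy for the size contrast rather than Protobuf itself.

```python
import json
import struct

record = {"user_id": 12345, "score": 98.6, "active": True}

as_json = json.dumps(record).encode()
# uint32 + double + bool, little-endian: 4 + 8 + 1 = 13 bytes
as_binary = struct.pack("<Id?", 12345, 98.6, True)

print(len(as_json), len(as_binary))  # roughly 49 vs 13 bytes
```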
58
SemVer Explained in Very Simple Terms
Semantic versioning (SemVer) is a standardized way to communicate software upgrades. It categorizes changes into three buckets:

🔴 Major: Contains breaking changes that require users to upgrade their code or integration.
🟢 Minor: Changes are backward-compatible; they typically extend functionality or improve performance.
🟣 Patch: Contains bug fixes that don't change existing functionality.

Pro tip: a simplified framework for thinking about SemVer is "Breaking.Feature.Fix".

SemVer provides an easy and clear way to communicate changes in software, which helps manage dependencies, plan releases, and troubleshoot problems. A small parsing sketch follows.
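Here is a minimal Python sketch that parses a "MAJOR.MINOR.PATCH" string and flags potentially breaking upgrades. Pre-release and build metadata (e.g. "1.2.3-beta.1") are deliberately out of scope for this sketch.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def is_breaking_upgrade(current: str, target: str) -> bool:
    """Per SemVer, only a major-version bump may contain breaking changes."""
    return parse(target)[0] > parse(current)[0]

print(is_breaking_upgrade("1.4.2", "2.0.0"))  # True: major bump (Breaking)
print(is_breaking_upgrade("1.4.2", "1.5.0"))  # False: minor bump (Feature)
```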
59