Database Design Patterns Flashcards
Database Per Service
Summary:
One database, server, schema, or set of private tables per microservice. Data should only be accessed through the service’s API. Eventual data consistency is the goal of the system.
Detail:
One database per microservice; it must be private to that service only. It should be accessed by the microservice API only. It cannot be accessed by other services directly. For example, for relational databases, we can use private-tables-per-service, schema-per-service, or database-server-per-service.
Design a single database per service
The database is private to the microservice. Data is only accessed through the microservice API.
Variants of this pattern for relational databases include private-tables-per-service, schema-per-service, or database-server-per-service. Regardless of the details, any solution should prevent other services from accessing a service’s tables and keep each service’s data private.
Keep scalability, high availability and disaster recovery requirements in mind.
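Example (sketch):
The points above can be illustrated with a minimal Python sketch. It assumes two hypothetical services, CustomerService and OrderService, and uses a separate in-memory SQLite database to stand in for each service's private store. The class names, table layouts, and method names are illustrative only and do not come from the source.

```python
import sqlite3
from typing import Optional


class CustomerService:
    """Hypothetical service that owns the customer data."""

    def __init__(self) -> None:
        # Private store: nothing outside this class holds the connection.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE customers (id INTEGER PRIMARY KEY, credit_limit REAL)"
        )

    def add_customer(self, customer_id: int, credit_limit: float) -> None:
        self._db.execute(
            "INSERT INTO customers (id, credit_limit) VALUES (?, ?)",
            (customer_id, credit_limit),
        )
        self._db.commit()

    def get_credit_limit(self, customer_id: int) -> Optional[float]:
        # Public API: the only way another service can read customer data.
        row = self._db.execute(
            "SELECT credit_limit FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


class OrderService:
    """Hypothetical service with its own database; it calls the API above."""

    def __init__(self, customers: CustomerService) -> None:
        self._customers = customers
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE orders ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, customer_id INTEGER, total REAL)"
        )

    def place_order(self, customer_id: int, total: float) -> bool:
        # No cross-database join: customer data is fetched through the API.
        limit = self._customers.get_credit_limit(customer_id)
        if limit is None or total > limit:
            return False
        self._db.execute(
            "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
            (customer_id, total),
        )
        self._db.commit()
        return True

    def order_count(self) -> int:
        return self._db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

In a real deployment the call to get_credit_limit would be a network request (REST, gRPC, or similar) rather than a method call; the point of the sketch is only that OrderService never sees the customers table.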
https://docs.microsoft.com/en-us/azure/architecture/microservices/design/data-considerations
Shared Database Per Service
Summary:
Not ideal for greenfield microservice solutions. Impedes ability to evolve service needs without affecting other services relying on shared data. Brownfield microservice transformations can leverage this model while a better understanding of domains and data granularity is obtained. Note: Services within a single domain often communicate better with synchronous communication patterns like REST. Communication between domains is better accomplished via asynchronous communications (e.g. events).
Detail:
An anti-pattern for microservices. However, when the application is a monolith that needs decomposition, splitting its data is not easy, so a shared database can serve as an interim phase for brownfield applications. It should not be applied to greenfield applications.
When decomposing monoliths:
It often takes several attempts to find the right level of granularity for microservices.
Decomposing code is much easier than decomposing data. Decomposing data can be very difficult. You only want to do this once if possible.
Consider using the Shared Database pattern when decomposing a monolith — but only until the code is stable. When the code is stable, decompose the database and move to a Single Database Per Service pattern.
Adopting a microservice architecture comes with technical and non-technical dangers. These pro tips hopefully provide sound advice in addressing them:
Pro tip #1: Budget issues. Every company has competing budget priorities. When you deploy your services with a shared database the system will be “working” — and probably working better than the old monolithic architecture did. Business owners will be happy with the improvements and also happy to spend your remaining budget elsewhere before you can decompose your data. You run the risk of ending up with the dreaded Distributed Monolith.
Pro tip #2: Avoid Distributed Monoliths. Microservices are meant to provide a company with the highest degree of flexibility in building new features and responding to customer demand. This is one reason why it is important that each service be independently deployable. By contrast, even the smallest change to a monolith requires testing and deploying the entire codebase, a time-consuming and error-prone endeavor.
A Distributed Monolith is created by tight coupling between microservices. Even though code is distributed across multiple services, the system acts like a monolith in that a change to any one service requires deploying all or nearly all the other services. Distributed Monoliths couple the complexity inherent in microservices (different code bases, different build pipelines and deployment models) with the inflexibility of a Monolith. A Distributed Monolith is literally the worst of both worlds.
Pro Tip #3: Use Allow Lists. Microservices interact with each other through defined APIs and communication protocols. As long as a service adheres to these standards, everything else becomes an implementation detail. A team is free to use the language, database, development processes and even DevOps pipelines of their choice.
This freedom is very valuable and a large benefit of microservice architecture. Left unchecked, this freedom can become a maintenance nightmare as each team implements their own favorite solution to every problem.
Many companies solve this problem by maintaining an allowed list of approved tools in each category. For example, MySQL or MariaDB for relational databases, Mongo or Couchbase for NoSQL solutions, H2 and Redis for in-memory DB, etc.
Pro Tip #4: Domain Modeling is critical to your success. A single microservice is like a neuron in the human brain — interesting in itself, but the real magic happens when neurons and microservices act in concert to produce thought or a scalable, self-healing application.
Neurons communicate with one another in identifiable patterns that are optimized for human cognition. Like neurons, microservices also have optimal communication patterns that are critical to an application’s performance and success. Services within a single domain often communicate better with synchronous communication patterns like REST. Communication between domains is often better accomplished through asynchronous eventing schemes.
Command Query Responsibility Segregation (CQRS)
Summary:
Not needed if domain / business rules are simple or a simple CRUD interface is sufficient.
Commands (Create, Update, Delete) are task based and asynchronous (placed in a queue).
Queries never modify the database; they return DTOs (Data Transfer Objects) read from a materialized view, which avoids join complexity.
Pattern generally assumes separate and/or multiple read stores that are optimized for their respective clients.
Therefore the pure pattern requires some form of data ETL, read replica, or eventual consistency mechanism to update read store data.
Pattern best applied to portions of a system (e.g. bounded contexts)
Detail:
Once database-per-service is implemented, some queries need to join data owned by multiple services, which a single database query can no longer do. CQRS suggests splitting the application into two parts: the command side and the query side.
The command side handles the Create, Update, and Delete requests
The query side handles the query part by using the materialized views
The event sourcing pattern is generally used along with it to create events for any data change. Materialized views are kept updated by subscribing to the stream of events.
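Example (sketch):
The command/query split described above might look roughly like the following Python sketch. Everything here is an assumption made for illustration: the OrderPlaced event, the in-process EventBus, and the CustomerTotalsView are hypothetical names, and the bus delivers events synchronously, whereas a real system would typically use a queue or broker so that the view is only eventually consistent.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class OrderPlaced:
    """Hypothetical event emitted by the command side."""
    order_id: int
    customer: str
    total: float


class EventBus:
    """In-process stand-in for a message broker (assumption)."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[OrderPlaced], None]] = []

    def subscribe(self, handler: Callable[[OrderPlaced], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: OrderPlaced) -> None:
        for handler in self._subscribers:
            handler(event)


class OrderCommandHandler:
    """Command side: validates, writes to its own store, publishes an event."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus
        self._orders: Dict[int, OrderPlaced] = {}  # write store

    def place_order(self, order_id: int, customer: str, total: float) -> None:
        if total <= 0:
            raise ValueError("total must be positive")
        if order_id in self._orders:
            raise ValueError("duplicate order id")
        event = OrderPlaced(order_id, customer, total)
        self._orders[order_id] = event
        self._bus.publish(event)


class CustomerTotalsView:
    """Query side: a materialized view kept current by subscribing to events."""

    def __init__(self, bus: EventBus) -> None:
        self._totals: Dict[str, float] = {}  # read store
        bus.subscribe(self._on_order_placed)

    def _on_order_placed(self, event: OrderPlaced) -> None:
        self._totals[event.customer] = (
            self._totals.get(event.customer, 0.0) + event.total
        )

    def total_for(self, customer: str) -> float:
        # Queries only read; they never modify either store.
        return self._totals.get(customer, 0.0)
```

The view holds exactly the shape a client needs (total spend per customer), so answering the query requires no join at read time.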
https://martinfowler.com/bliki/CQRS.html
https://docs.microsoft.com/en-us/azure/architecture/patterns/cqrs
Event Sourcing
Summary:
A complex pattern that addresses standard CRUD-based systems that suffer from poor performance and a lack of auditable changes. Event sourcing stores changes as events in an append-only store (no deletes). When paired with the CQRS pattern, these events can be published so that consumers can update their respective read stores. Data auditing is a built-in feature, as is the ability to replay events, and clients’ read needs are decoupled from create, update, and delete (CUD) operations.
Detail:
Most applications work with data, and the typical approach is for the application to maintain the current state of the data by updating it as users work with it. For example, in the traditional create, read, update, and delete (CRUD) model a typical data process is to read data from the store, make some modifications to it, and update the current state of the data with the new values—often by using transactions that lock the data.
The CRUD approach has some limitations:
CRUD systems perform update operations directly against a data store, which can slow down performance and responsiveness, and limit scalability, due to the processing overhead it requires.
In a collaborative domain with many concurrent users, data update conflicts are more likely because the update operations take place on a single item of data.
Unless there’s an additional auditing mechanism that records the details of each operation in a separate log, history is lost.
The Event Sourcing pattern defines an approach to handling operations on data that’s driven by a sequence of events, each of which is recorded in an append-only store. Application code sends a series of events that imperatively describe each action that has occurred on the data to the event store, where they’re persisted. Each event represents a set of changes to the data (such as AddedItemToOrder).
The events are persisted in an event store that acts as the system of record. The event store typically publishes these events so that consumers can be notified and can handle them if needed. Consumers could, for example, initiate tasks that apply the operations in the events to other systems, or perform any other associated action that’s required to complete the operation. Notice that the application code that generates the events is decoupled from the systems that subscribe to the events.
Typical uses of the events published by the event store are to maintain materialized views of entities as actions in the application change them, and for integration with external systems. For example, a system can maintain a materialized view of all customer orders that’s used to populate parts of the UI. As the application adds new orders, adds or removes items on the order, and adds shipping information, the events that describe these changes can be handled and used to update the materialized view.
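Example (sketch):
The append-only store and replay idea described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production event store: the Event fields, the EventStore class, and the RemovedItemFromOrder event name are hypothetical (only AddedItemToOrder appears in the text above), and events are kept in a plain in-memory list.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    order_id: str
    kind: str      # "AddedItemToOrder" or "RemovedItemFromOrder" (assumed names)
    item: str
    quantity: int


class EventStore:
    """Append-only log: events are added, never updated or deleted."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> None:
        self._events.append(event)

    def events_for(self, order_id: str) -> Tuple[Event, ...]:
        # Return an immutable copy so callers cannot rewrite history.
        return tuple(e for e in self._events if e.order_id == order_id)


def replay(events: Iterable[Event]) -> Dict[str, int]:
    """Rebuild the current state of an order by applying its events in order."""
    items: Dict[str, int] = {}
    for event in events:
        if event.kind == "AddedItemToOrder":
            items[event.item] = items.get(event.item, 0) + event.quantity
        elif event.kind == "RemovedItemFromOrder":
            remaining = items.get(event.item, 0) - event.quantity
            if remaining > 0:
                items[event.item] = remaining
            else:
                items.pop(event.item, None)
    return items
```

Because the full history is kept, replaying a prefix of the events reconstructs the order as it was at an earlier point, which is where the built-in audit trail comes from.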
https://docs.microsoft.com/en-us/azure/architecture/patterns/event-sourcing
Saga Pattern
Summary:
A sequence of local transactions, used for business transactions that span multiple services when two-phase commit (2PC) is not an option. Each service updates its own database and publishes a message or event. There are two ways to coordinate sagas:
Choreography - Each local transaction publishes domain events (web hook, websocket, etc) that trigger local transactions in other services (service responsibility)
Benefits: Good for simple workflows with few participants, no single point of failure (orchestrator), doesn’t require an additional service implementation
Drawbacks: Difficult to track which services listen to what, risk of cyclic (circular) dependency, difficult integration testing since it requires all services to be running
Orchestration - An orchestrator object/service tells participants what local transactions to execute.
Benefits: Good for complex workflows, doesn't introduce cyclic dependencies, participants don't need to know about commands for other participants
Drawbacks: Additional design complexity that requires implementing coordination logic, there is an additional point of failure with the orchestrator
Detail:
When each service has its own database and a business transaction spans multiple services, how do we ensure data consistency across services? For example, for an e-commerce application where customers have a credit limit, the application must ensure that a new order will not exceed the customer’s credit limit. Since Orders and Customers are in different databases, the application cannot simply use a local ACID transaction.
A Saga represents a high-level business process that consists of several sub requests, which each update data within a single service. Each request has a compensating request that is executed when the request fails. It can be implemented in two ways:
Choreography: When there is no central coordination, each service produces its own events, listens to other services’ events, and decides whether an action should be taken.
Orchestration — An orchestrator (object) takes responsibility for a saga’s decision making and sequencing business logic.
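Example (sketch):
The orchestration variant, applied to the credit-limit example above, could be sketched as follows in Python. This is a simplified, single-process illustration: run_saga, place_order, the step names, and the state dictionary are all hypothetical, and each "local transaction" is just a function call rather than a request to a separate service with its own database.

```python
from typing import Callable, Dict, List, Tuple

# A saga step: (name, action, compensating action).
Step = Tuple[str, Callable[[], None], Callable[[], None]]


class InsufficientCredit(Exception):
    pass


def run_saga(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Orchestrator: run steps in order; on failure, undo completed steps
    in reverse order by calling their compensating actions."""
    log: List[str] = []
    completed: List[Tuple[str, Callable[[], None]]] = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            log.append(f"{name}: failed")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return False, log
        completed.append((name, compensate))
        log.append(f"{name}: done")
    return True, log


def place_order(order_total: float, credit_limit: float) -> Tuple[str, List[str]]:
    # Stand-ins for the Orders and Customers databases (assumption).
    state: Dict[str, object] = {"order": None, "reserved": 0.0}

    def create_order() -> None:
        state["order"] = "PENDING"

    def reject_order() -> None:
        state["order"] = "REJECTED"

    def reserve_credit() -> None:
        if order_total > credit_limit:
            raise InsufficientCredit("order exceeds credit limit")
        state["reserved"] = order_total

    def release_credit() -> None:
        state["reserved"] = 0.0

    def approve_order() -> None:
        state["order"] = "APPROVED"

    def nothing_to_undo() -> None:
        pass

    _, log = run_saga([
        ("create_order", create_order, reject_order),
        ("reserve_credit", reserve_credit, release_credit),
        ("approve_order", approve_order, nothing_to_undo),
    ])
    return str(state["order"]), log
```

When the credit reservation fails, the orchestrator runs the compensation for the order that was already created, so the order ends up rejected instead of being left pending. A choreographed saga would reach the same outcome with each service reacting to the others' events and no central run_saga function.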
https://microservices.io/patterns/data/saga.html
https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/saga/saga