Describe common data workloads Flashcards
Transactional Workloads
A transactional system records transactions that encapsulate specific events that the organization wants to track. Think of a transaction as a small, discrete unit of work.
Transactional systems are often high-volume, sometimes handling many millions of transactions in a single day, so the data being processed has to be accessible very quickly. The work performed by transactional systems is often referred to as Online Transaction Processing (OLTP). Examples include e-commerce transactions and banking operations.
OLTP (Online Transaction Processing)
OLTP solutions rely on databases optimized for efficient read and write operations to support transactional workloads involving Create, Read, Update, and Delete (CRUD) operations. These operations adhere to ACID semantics:
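-Atomicity: Each transaction is treated as a single unit that either succeeds completely or fails completely. For example, a fund transfer that debits one account and credits another must complete both actions; if either cannot be completed, the other must fail too.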
-Consistency: Transactions can only take the database from one valid state to another. For example, after a transfer of funds between two accounts, the completed transaction must be accurately reflected in both balances.
-Isolation: Concurrent transactions don’t interfere with each other and must leave the database in a consistent state. For example, while one transaction transfers funds, another transaction that checks the account balances must return consistent results, never a mix of values from different stages of the transfer.
-Durability: Once committed, a transaction remains committed. After completing a fund transfer, the updated account balances persist even if the database is switched off.
Typical characteristics of transactional workloads:
-Single data source
-Many/Short transactions
-Latency sensitive
-Small payloads
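The atomicity and consistency properties above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the `accounts` table, the `transfer` and `balances` helpers, and the account names are all hypothetical. It uses an in-memory database, so it does not demonstrate durability, and a single connection, so it does not demonstrate isolation.

```python
import sqlite3

# Hypothetical schema: the CHECK constraint defines what a "valid state" is.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("alice", 100), ("bob", 50)],
    )


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one atomic transaction."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back every statement in it if an exception is raised.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would have violated the CHECK constraint, so the
        # credit applied just before it has been rolled back as well.
        return False


def balances(conn):
    """Return the current balance of every account as a dict."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


ok = transfer(conn, "alice", "bob", 30)       # both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # overdraft: nothing is applied
print(ok, failed, balances(conn))
```

The credit is deliberately issued before the debit so that the failed transfer has partial work to undo: the rollback is what keeps the balances at their last valid state.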
Analytical Workloads
Analytical workloads involve complex queries and aggregations performed on large datasets. The goal is to gain insights and perform data analysis.
Use Cases: Online Analytical Processing (OLAP) systems, data warehousing, and business intelligence applications. Examples include data mining, reporting, and trend analysis.
OLAP (Online Analytical Processing)
An OLAP model is an aggregated type of data storage that is optimized for analytical workloads. Data is aggregated across dimensions at different levels, enabling you to drill up or down to view aggregations at multiple hierarchical levels; for example, to find total sales by region, by city, or for an individual address.
-Because OLAP data is pre-aggregated, queries to return the summaries it contains can be run quickly.
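The idea of pre-aggregating along a hierarchy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SALES` rows, the region > city > address hierarchy, and the `build_cube` helper are hypothetical, and a real OLAP engine stores and indexes its aggregations very differently.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, city, address, amount)
SALES = [
    ("North", "Leeds", "1 High St", 120),
    ("North", "Leeds", "9 Mill Rd", 80),
    ("North", "York", "4 Abbey Ln", 50),
    ("South", "Bristol", "2 Dock Rd", 200),
]


def build_cube(rows):
    """Pre-aggregate totals at every level of the region > city > address hierarchy."""
    cube = defaultdict(int)
    for region, city, address, amount in rows:
        cube[(region,)] += amount                # total for the region
        cube[(region, city)] += amount           # total for the city
        cube[(region, city, address)] += amount  # total for the address
    return dict(cube)


# The aggregation work is done once, up front.
cube = build_cube(SALES)

# Drilling down is then a lookup rather than a scan of the raw rows.
north_total = cube[("North",)]
leeds_total = cube[("North", "Leeds")]
address_total = cube[("North", "Leeds", "1 High St")]
print(north_total, leeds_total, address_total)
```

The trade-off is the usual one for analytical models: more work and storage when the data is loaded, in exchange for fast summary queries afterwards.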
Different types of users might perform analytical work at different stages of the overall architecture. For example:
-Data scientists might work directly with data files in a data lake to explore and model data.
-Data analysts might query tables directly in the data warehouse to produce complex reports and visualizations.
-Business users might consume pre-aggregated data in an analytical model in the form of reports or dashboards.
Typical characteristics of analytical workloads:
-Multiple data sources
-Long/Few transactions
-Throughput sensitive
-Large payloads