Random Flashcards
What does OLTP stand for?
Online Transaction Processing
What does OLAP stand for?
Online Analytical Processing
What is a data warehouse primarily used for?
- Business Intelligence
- Analytics
- Reporting
What is a data lake primarily used for?
- Big Data
- Machine Learning
What type of data is stored in a data warehouse?
Structured
What type of data is stored in a data lake?
Structured, semi-structured, unstructured
What is a data lakehouse primarily used for?
Combining the querying capabilities of a data warehouse with the schema-less nature and high storage capacity of a data lake.
What does ACID stand for?
Atomicity, Consistency, Isolation, and Durability
What does CRUD stand for?
Create, Read, Update, Delete
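As a minimal sketch, the four CRUD operations map onto SQL statements roughly as follows, here using Python's built-in sqlite3 module with an in-memory database. The `cards` table and its columns are hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database; the "cards" table is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cards (id INTEGER PRIMARY KEY, front TEXT, back TEXT)")

# Create: insert a new row (it receives id 1 in an empty table).
conn.execute("INSERT INTO cards (front, back) VALUES (?, ?)",
             ("CRUD", "Create, Read, Update"))

# Read: fetch the row back.
row = conn.execute("SELECT front, back FROM cards WHERE id = 1").fetchone()

# Update: correct the answer text.
conn.execute("UPDATE cards SET back = ? WHERE id = 1",
             ("Create, Read, Update, Delete",))
updated = conn.execute("SELECT back FROM cards WHERE id = 1").fetchone()[0]

# Delete: remove the row.
conn.execute("DELETE FROM cards WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM cards").fetchone()[0]
```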
What does Atomicity in ACID mean?
Ensures that a transaction is all-or-nothing: either every operation in it completes, or none of them is applied.
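The rollback behavior behind atomicity can be sketched with Python's built-in sqlite3 module. The `accounts` table, the `transfer` helper, and the balances are hypothetical; the CHECK constraint is only there to force a failure partway through the transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False

ok = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In this sketch the credit to `bob` runs first and succeeds, yet it does not survive: the failed debit rolls back the whole transaction, leaving both balances unchanged.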
What does Consistency in ACID mean?
Guarantees the database remains valid before and after a transaction.
What does Isolation in ACID mean?
Ensures transactions run independently without affecting each other’s outcome.
What does Durability in ACID mean?
Once a transaction is committed, it stays permanent.
What is the 1NF (First Normal Form)?
All records have the same structure (same number of fields) and each field contains a single value, with no repeating groups of data.
What is the 2NF (Second Normal Form)?
The table is in 1NF and every non-key field provides a fact about the whole key, not just part of a composite key.
What is the 3NF (Third Normal Form)?
The table is in 2NF and non-key fields depend on (describe) only the primary key, not other non-key fields.
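A small worked example may help with the third normal form. The sketch below uses plain Python dictionaries rather than a real database, and the order and customer fields are hypothetical.

```python
# Hypothetical denormalized rows keyed by order_id. customer_city describes
# customer_id (a non-key field), not the order itself, which violates 3NF.
orders_flat = [
    {"order_id": 1, "customer_id": "c1", "customer_city": "Oslo", "total": 40},
    {"order_id": 2, "customer_id": "c1", "customer_city": "Oslo", "total": 15},
    {"order_id": 3, "customer_id": "c2", "customer_city": "Lima", "total": 22},
]

# Split into two tables so each non-key field describes only its own key.
customers = {r["customer_id"]: {"city": r["customer_city"]} for r in orders_flat}
orders = {
    r["order_id"]: {"customer_id": r["customer_id"], "total": r["total"]}
    for r in orders_flat
}
```

After the split, a customer's city is stored once, so changing it cannot leave some order rows out of date.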
What is data governance?
The overall management of data availability, usability, integrity, and security.
What is sharding?
Distributing data across multiple database instances or servers.
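One common way to decide which shard holds a record is to hash its key. The sketch below is an illustration only: the shard count, the `shard_for`, `put`, and `get` helpers, and the in-memory dictionaries standing in for database servers are all assumptions.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count

def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Each dict stands in for a separate database instance.
shards = {i: {} for i in range(NUM_SHARDS)}

def put(key, value):
    """Store value on the shard that owns key."""
    shards[shard_for(key)][key] = value

def get(key):
    """Look up key on the shard that owns it."""
    return shards[shard_for(key)].get(key)

for user_id in ("user-1", "user-2", "user-3", "user-4", "user-5"):
    put(user_id, {"id": user_id})
```

Simple modulo hashing like this remaps most keys when the shard count changes; consistent hashing is a common alternative when shards are added or removed often.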
What are the benefits of sharding?
Can handle high volumes of traffic and allows for parallel processing.
What does the acronym CAP (CAP theorem) stand for?
Consistency, Availability, Partition Tolerance
What does the CAP theorem state?
A distributed data store can guarantee at most two of the three properties at the same time; when a network partition occurs, it must choose between consistency and availability.
What is data modeling?
The process of creating a conceptual representation of data and its relationships within a database.
What is data transformation?
The process of converting data from its original format into a format suitable for analysis or processing.
What are common techniques used in data transformation?
- Data cleansing
- Data normalization
- Aggregation
- Data mapping
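The techniques listed above can be sketched on a tiny in-memory dataset. The records and field names are hypothetical, and "normalization" is taken here to mean standardizing case and type; other meanings, such as scaling numeric values, are not shown.

```python
from collections import defaultdict

# Hypothetical raw records with inconsistent casing, whitespace, and a gap.
raw = [
    {"Country": " us ", "amount": "10.5"},
    {"Country": "US", "amount": "4.5"},
    {"Country": "de", "amount": None},  # missing value
    {"Country": "DE", "amount": "20"},
]

# Data cleansing: drop rows with a missing amount.
clean = [r for r in raw if r["amount"] is not None]

# Data mapping and normalization: rename the field, standardize case and type.
mapped = [
    {"country": r["Country"].strip().upper(), "amount": float(r["amount"])}
    for r in clean
]

# Aggregation: total amount per country.
totals = defaultdict(float)
for r in mapped:
    totals[r["country"]] += r["amount"]
totals = dict(totals)
```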