DBMS Flashcards
What is OLTP and OLAP?
OLTP stands for Online Transaction Processing. It is an online transactional system that tracks day to day transactions of an organisation, typically banks, sales and trading. It manages database modification.
OLAP stands for Online Analytical Processing. It is an online retrieving and data analysis system.
Compare the focus of OLTP and OLAP
OLTP focuses on INSERT, UPDATE and DELETE information OLAP focuses on extracting data and analysing them for decision making
Compare data sources for OLTP and OLAP
The data source for OLTP is OLTP itself and its transactions
The data source for OLAP is different OLTP databases
Compare transaction length between OLTP and OLAP
OLTP Has short transactions
OLAP has long transactions
Compare queries in OLTP and OLAP
Queries in OLTP are simpler
Queries in OLAP are more complex
How is data in OLTP and OLAP normalised?
data in OLTP is normalised to 3NF
data in OLAP is not normalised
What is the difference regarding integrity constraints between OLTP and OLAP?
data in OLTP must frequently maintain data integrity constraint
data in OLAP does not get frequently modified. Hence, data integrity is not affected
How does OLTP and OLAP store data?
OLTP: traditional DBMS
OLAP: Data warehouse
Who uses OLTP and OLAP?
OLTP: Clerks and IT Professionals
OLAP: Knowledge workers
Compare the number of users between OLTP and OLAP?
OLTP: Thousands of users
OLAP: Hundreds of users
What is a data warehouse?
A data warehouse is a centralised repository designed to store, organised and manage large amounts of structured data from various sources. It is built to support complex querying and analysis, which are essential for business intelligence activities, decision-making and reporting. The warehouse is structured in a way that data from different domains, such as sales, marketing, HR or finance can be correlated and analysed together.
Key features:
Integration of Data: It integrates data from various sources and formats, ensuring consistency across different data types and systems.
Orthogonal Data Dimensions: It provides a multi-dimensional view of data, which includes dimensions like time (historical and current data) and subject areas (like sales and HR).
Objective: The primary objective is to facilitate reporting and data analysis, providing support for OLAP operations and helping organizations to make data-driven decisions.
Business Intelligence: It is a key component of business intelligence frameworks, offering a comprehensive view of an organization’s data.
Architecture: The architecture of a data warehouse often includes an operational layer of source systems, an integration layer where ETL (Extract, Transform, Load) or ELT (Extract, Load, Transform) processes and data cleansing occur, and the data warehouse itself where the processed data is stored.
Compare DBMS and Data Warehouse
DBMS is tuned for OLTP
Data Warehouse is tuned for OLAP
DBMS is used to run a business
Data Warehouse is used to optimise a business
What are two forms of data loading, which is a key topic of data warehouse
ELT and ETL
What is ETL?
ETL Stands for extract, transform and load. It is a process to move raw data from one or more sources into a data warehouse, where it can be stored, queried and analysed.
Extract: The first step is to pull the data from different sources. The data extracted can be structured or unstructured and may come in different formats.
Transform: The next step is where data is cleansed, enriched and transformed into a suitable format for analysis.
Data cleansing to correct or remove corrupt or inaccurate records.
Data mapping to ensure that data from one source fits into the destination.
Data conversion for standardizing formats, such as dates and numerical values.
Joining or splitting data fields, such as combining first and last names into a full name field or vice versa.
Aggregation or summarization of data, such as calculating total sales.
Load: The last step is to load the transformed data into the DW.
This ETL process is critical to make sure we have a unified system where data is in a suitable format for comprehensive analysis and Business Intelligence.
What are the 3 layers involved in data warehousing?
- Staging Layer: This is where we store raw data from the different sources. It is a temporary storage area used for data processing during the ETL process.
- Integration Layer: This layer is responsible for combining data from the staging layer and organising it into a structure suitable for analysis and query in DW. Data is organised into hierarchical groups or dimensions, and facts (quantitative data) are separated from these dimensions. The schema used in data warehouse typically includes facts and dimensions, which are used to build star or snowflake schemas for organising data into a multidimensional DB structure
- Access layer: This layer is the front end that users interact with to retrieve data from data warehouse. it includes tools for reporting, querying and data analysis.