ETL Flashcards
Define Extraction
Extraction is the collection of data from multiple targeted
sources, such as SQL or NoSQL databases, cloud platforms, or
XML files.
Why is extraction the most complicated task in ETL?
Because many sources do not provide data of the required
quality or quantity, and determining which data is eligible
for extraction is not an easy process.
What are the two types of extraction?
Logical extraction and
physical extraction
What are the two kinds of logical extraction?
Full extraction and
Incremental extraction
What’s the difference between full and incremental extraction?
Full extraction pulls all the data from the source and is used
when the system can’t identify which data has been updated,
whereas incremental extraction extracts and loads only the new
or changed data, not the whole data set.
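The contrast can be sketched in a few lines of Python. This is only an illustration: the orders table, its updated_at column, and the watermark value are all assumptions made up for the example, and a timestamp watermark is just one common way to detect changes.

```python
import sqlite3

# Hypothetical source table with a last-modified column (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, updated_at TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10.0, "2024-01-01"), (2, 25.0, "2024-01-05"), (3, 40.0, "2024-01-09")],
)

def full_extract(conn):
    """Read every row; no change tracking is needed."""
    return conn.execute(
        "SELECT id, amount, updated_at FROM orders ORDER BY id"
    ).fetchall()

def incremental_extract(conn, watermark):
    """Read only rows modified after the last successful run (the watermark)."""
    return conn.execute(
        "SELECT id, amount, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY id",
        (watermark,),
    ).fetchall()

all_rows = full_extract(conn)
changed_rows = incremental_extract(conn, "2024-01-03")
```

Here all_rows holds all three orders, while changed_rows holds only the two modified after the watermark.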
What are the two kinds of physical extraction?
Online extraction and Offline extraction
What’s the difference between the two kinds of physical extraction?
In online extraction, data is extracted directly from the
source systems. In offline extraction, data isn’t extracted
directly from the source systems: it is first copied to an
external file, and the extraction process then connects to that
external file and starts processing.
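A minimal sketch of both paths, assuming a made-up customers table and a CSV dump as the external file (the file format and all names here are illustrative, not part of the definition):

```python
import csv
import os
import sqlite3
import tempfile

# Hypothetical source system (an in-memory database stands in for it).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
source.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")]
)

def online_extract(conn):
    """Online: query the source system directly."""
    return conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()

def dump_to_file(conn, path):
    """Offline, step 1: copy the source data to an external file."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(
            conn.execute("SELECT id, name FROM customers ORDER BY id")
        )

def offline_extract(path):
    """Offline, step 2: extract from the file, never touching the source."""
    with open(path, newline="") as f:
        return [(int(row[0]), row[1]) for row in csv.reader(f)]

online_rows = online_extract(source)

with tempfile.TemporaryDirectory() as tmp:
    dump_path = os.path.join(tmp, "customers.csv")
    dump_to_file(source, dump_path)
    offline_rows = offline_extract(dump_path)
```

Both paths end with the same rows; the difference is whether the extraction process reads from the source system or from the copy.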
What do we do in the transformation stage of ETL?
The data is transformed to meet the schema and
requirements of the destination. Data transformation means
converting the structure or format of a data set to match that
of the target system.
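As a small illustration, the sketch below reshapes one source record into a target schema. The field names (CustID, Name, Signup) and the target layout are hypothetical; real transformations depend on the destination's actual requirements.

```python
from datetime import datetime

def transform(record):
    """Rename fields, fix types, and normalize formats for the target schema."""
    return {
        # The target expects an integer key rather than a string.
        "customer_id": int(record["CustID"]),
        # Trim stray whitespace and normalize capitalization.
        "full_name": record["Name"].strip().title(),
        # Convert day/month/year text into an ISO 8601 date string.
        "signup_date": datetime.strptime(record["Signup"], "%d/%m/%Y")
        .date()
        .isoformat(),
    }

raw = {"CustID": "42", "Name": "  ada lovelace ", "Signup": "10/12/2023"}
clean = transform(raw)
```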
What do we do in the loading stage of ETL?
It involves placing the data into the target system,
typically a cloud data warehouse, where it is ready
to be analyzed by BI tools or data analytics tools
Name the types of load.
Initial load, incremental load, and full refresh
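The card only names the three load types, so the sketch below rests on their usual reading, which is an assumption here: an initial load fills an empty target, an incremental load applies only new or changed rows, and a full refresh erases the target and reloads it. The dim_product table is hypothetical.

```python
import sqlite3

target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE dim_product (id INTEGER PRIMARY KEY, name TEXT)")

def initial_load(conn, rows):
    """Populate the empty target for the first time."""
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)", rows)

def incremental_load(conn, rows):
    """Apply only new or changed rows; existing keys are overwritten."""
    conn.executemany("INSERT OR REPLACE INTO dim_product VALUES (?, ?)", rows)

def full_refresh(conn, rows):
    """Erase the target contents and reload everything."""
    conn.execute("DELETE FROM dim_product")
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)", rows)

def snapshot(conn):
    return conn.execute("SELECT id, name FROM dim_product ORDER BY id").fetchall()

initial_load(target, [(1, "pen"), (2, "ink")])
after_initial = snapshot(target)

incremental_load(target, [(2, "blue ink"), (3, "paper")])
after_incremental = snapshot(target)

full_refresh(target, [(1, "pen")])
after_refresh = snapshot(target)
```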
What are semantic models used for?
Semantic models help business users abstract away relationship
complexities and make it easier to analyze data quickly
What’s the difference between ETL and ELT?
Unlike ETL, where data transformation takes place in a staging
area before the data is loaded into the target system, ELT loads
the raw data directly into the target system and transforms it
there.
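The ordering difference can be sketched as follows. Everything here is an assumption for illustration: the sales data, the table names, plain Python standing in for the staging area, and an in-memory SQLite database standing in for the target system.

```python
import sqlite3

# Raw source rows: everything arrives as untidy text.
raw_rows = [("1", " 10.50 "), ("2", "7")]

def run_etl(rows):
    """ETL: transform in a staging step, then load the clean rows."""
    staged = [(int(i), float(a.strip())) for i, a in rows]  # staging area
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
    target.executemany("INSERT INTO sales VALUES (?, ?)", staged)
    return target.execute("SELECT id, amount FROM sales ORDER BY id").fetchall()

def run_elt(rows):
    """ELT: load the raw rows first, then transform inside the target."""
    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE raw_sales (id TEXT, amount TEXT)")
    target.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    target.execute(
        "CREATE TABLE sales AS "
        "SELECT CAST(id AS INTEGER) AS id, "
        "CAST(TRIM(amount) AS REAL) AS amount FROM raw_sales"
    )
    return target.execute("SELECT id, amount FROM sales ORDER BY id").fetchall()

etl_result = run_etl(raw_rows)
elt_result = run_elt(raw_rows)
```

Both pipelines produce the same sales table; only where the transformation runs differs.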
OLAP
Online analytical processing (OLAP) is a technology that organizes
large business databases and supports complex analysis. It can be
used to perform complex analytical queries without negatively
affecting transactional systems
OLTP
The databases that a business uses to store all its transactions and
records are called online transaction processing (OLTP) databases.
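To contrast the two workloads, here is a rough sketch of a transactional write next to an analytical query. In practice these usually run on separate systems; a single in-memory SQLite database and a made-up orders table are used here only to show the different query shapes.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")

def record_order(conn, order_id, region, amount):
    """OLTP-style work: a small write that touches a single row."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, region, amount)
        )

def revenue_by_region(conn):
    """OLAP-style work: an aggregate that scans many rows."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()

record_order(db, 1, "EU", 20.0)
record_order(db, 2, "US", 5.0)
record_order(db, 3, "EU", 10.0)
summary = revenue_by_region(db)
```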
What’s MDX?
Multidimensional Expressions (MDX) is a calculation and query
language for online analytical processing (OLAP) in a database
management system. It is used to define, use, and retrieve data
from multidimensional objects.