8.2 Data engineering, extraction and mining Flashcards
1
Q
Data engineering is the practice of
A
- designing and building systems for collecting, storing and analysing large sets of data
2
Q
The three stages in blending data from: (Pg 209)
A
- multiple sources into a destination system (e.g. a data warehouse) are (ETL):
- Extraction
- Transformation
- Loading
- An ETL system provides data in a clean, ready-to-use format
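The three stages above can be sketched as a small pipeline. This is a minimal illustration using only the Python standard library, not a production design: the sample data, the `sales` table and the function names are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data; real sources would be files, APIs or databases.
SALES_CSV = "order_id,amount,region\n1, 120.50 ,north\n2,80,SOUTH\n3,,north\n"


def extract(text):
    """Extraction: read raw rows from a source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: clean values and drop rows that cannot be used."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows with a missing amount
        cleaned.append(
            (int(row["order_id"]), float(amount), row["region"].strip().lower())
        )
    return cleaned


def load(rows, conn):
    """Loading: write the prepared rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages in order: extract, transform, load."""
    load(transform(extract(text)), conn)


# An in-memory SQLite database stands in for the data warehouse.
conn = sqlite3.connect(":memory:")
run_etl(SALES_CSV, conn)
```

Keeping each stage in its own function means one stage can be changed or tested without touching the others.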
3
Q
Extraction is the process of
A
- harvesting data from multiple sources
- prior to this comes data profiling, which involves analysing the data to understand its content, format and structure
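Data profiling can be illustrated with a short sketch that summarises each column before extraction. The sample records, the column names and the `profile` and `infer_type` helpers are hypothetical, and a real profiling tool would check far more than this.

```python
from collections import Counter

# Hypothetical sample records, as they might arrive from a source system.
records = [
    {"customer_id": "C001", "joined": "2023-01-15", "spend": "250.00"},
    {"customer_id": "C002", "joined": "15/02/2023", "spend": ""},
    {"customer_id": "C003", "joined": "2023-03-01", "spend": "99.5"},
]


def infer_type(value):
    """Classify a raw string as 'number' or 'text'."""
    try:
        float(value)
        return "number"
    except ValueError:
        return "text"


def profile(rows):
    """Per column: count missing values, distinct values and inferred types."""
    summary = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [row.get(col, "") for row in rows]
        present = [v for v in values if v != ""]
        summary[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "types": dict(Counter(infer_type(v) for v in present)),
        }
    return summary


report = profile(records)
```

Here the report would flag the missing `spend` value. The inconsistent date formats in `joined` would still need a format-specific check, since both are simply classed as text.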
4
Q
Transformation takes the
A
- extracted data and transforms it into a suitable format for the destination database and intended use
- this is done using code and rules designed to interrogate the source data and then convert it into a new format as the code instructs
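One way to picture "code and rules" is a table of rules, one per destination field, applied to each source row. This is a sketch under assumptions: the field names, the date format and the `apply_rules` helper are invented for the example.

```python
from datetime import datetime

# Hypothetical rules: each destination field maps to a function of the source row.
RULES = {
    # Trim whitespace and standardise the case of identifiers.
    "customer_id": lambda row: row["CustID"].strip().upper(),
    # Convert day/month/year dates to ISO format (YYYY-MM-DD).
    "joined": lambda row: datetime.strptime(row["JoinDate"], "%d/%m/%Y")
    .date()
    .isoformat(),
    # Remove thousands separators and store the amount as a number.
    "spend_gbp": lambda row: round(float(row["Spend"].replace(",", "")), 2),
}


def apply_rules(row, rules):
    """Build a destination row by applying every rule to the source row."""
    return {field: rule(row) for field, rule in rules.items()}


source_row = {"CustID": " c001 ", "JoinDate": "15/02/2023", "Spend": "1,250.5"}
target_row = apply_rules(source_row, RULES)
```

Holding the rules as data makes it easier to review or extend them without rewriting the code that applies them.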
5
Q
Loading is when the
A
- newly cleaned and prepared data is uploaded into the destination database ready for use
6
Q
A data warehouse is a
A
- store for data that has been loaded in the ETL process
- the data is stored in a systematic and logical way, ready for further interrogation and analysis by the business intelligence function
7
Q
Business intelligence (BI) is the
A
- technology-driven process of analysing business data to create insightful and actionable information that helps improve the operations or products of the business
- findings are presented using data visualisation techniques as well as traditional summaries and reports
- Data mining is an important component of BI
8
Q
Data mining is the
A
- process of uncovering patterns and other valuable information from large sets of data in the data warehouse
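As a small illustration of uncovering a pattern, the sketch below counts which pairs of items appear together most often across a set of transactions. The card does not name a specific technique, so pair counting is only one assumed example, and the baskets and the `frequent_pairs` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, as they might be queried from a data warehouse.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
]


def frequent_pairs(transactions, min_support):
    """Return item pairs that occur together in at least min_support transactions."""
    counts = Counter()
    for items in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


pairs = frequent_pairs(baskets, min_support=2)
```

A result such as bread and butter being bought together is the kind of pattern that could then feed into business intelligence reporting.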
9
Q
Big data, ETL and BI - main challenges for traditional ETL programs:
A
- Rate of growth - data volumes are growing at unprecedented levels
- Types and sources - data for a modern business comes in all shapes and sizes and from multiple internal and external sources
- New data technology - uses innovative techniques to handle the challenges of big data: volume, variety, veracity and velocity
10
Q
The finance function's (FF's) role in data engineering, extraction and mining
A
- they will use their unique set of competencies to act as an interface between data specialists and the business
- making the generated data commercially relevant and transforming it into valuable information
11
Q
Data engineering, extraction and mining and the information to impact framework (pg 212) - they will add value through:
A
- Assembling information - raw data from a variety of relevant sources is collected, cleaned and transformed into meaningful information
- Generating insights - financial and non-financial information is analysed for insights to improve performance
- Influencing decision makers - insights are used to advise and influence the relevant stakeholders
- Achieving impact - guide actions to help achieve the desired impact