8.2 Data engineering, extraction and mining Flashcards

1
Q

Data engineering is the practice of

A
  • designing and building systems for collecting, storing and analysing large sets of data
2
Q

The three stages in blending data from (pg 209)

A
  • multiple sources into a destination system (e.g. a data warehouse) are known as ETL:
    • Extraction
    • Transformation
    • Loading
  • an ETL system provides data in a clean, ready-to-use format
3
Q

Extraction is the process of

A

  • harvesting data from multiple sources
  • prior to this comes data profiling, which involves analysing the data to understand its content, format and structure
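
As a rough illustration of data profiling, the sketch below summarises the content and completeness of each column in a small extract. The sample data, column names and the `profile` helper are all invented for this example; real profiling tools report far more.

```python
import csv
import io

# Hypothetical source extract; the columns and values are invented.
SAMPLE_CSV = """customer_id,country,order_value
1001,UK,250.00
1002,uk,
1003,France,99.5
"""

def profile(csv_text):
    """Summarise each column: row count, missing values, distinct values."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        summary[column] = {
            "rows": len(values),
            "missing": sum(1 for v in values if v == ""),
            "distinct": len({v for v in values if v != ""}),
        }
    return summary

report = profile(SAMPLE_CSV)
for column, stats in report.items():
    print(column, stats)
```

Here the profile would flag one missing `order_value` and inconsistent spellings of the same country ("UK" and "uk"), both of which need handling before the data is used.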

4
Q

Transformation takes the

A
  • extracted data and transforms it into a suitable format for the destination database and intended use
  • this is done using code and rules that interrogate the source data and convert it into the new format
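
A minimal sketch of what such transformation rules might look like, assuming extracted records arrive as dictionaries of text. The field names, the country-code mapping and the `transform` function are hypothetical, chosen only to show rules being applied to source data.

```python
def transform(record):
    """Apply simple, explicit rules to one extracted record."""
    country_codes = {"uk": "GB", "france": "FR"}  # illustrative mapping only
    raw_country = record["country"].strip().lower()
    raw_value = record["order_value"].strip()
    return {
        # Rule 1: identifiers become integers.
        "customer_id": int(record["customer_id"]),
        # Rule 2: country names are standardised to a single code.
        "country": country_codes.get(raw_country, "UNKNOWN"),
        # Rule 3: blank amounts become None rather than an invalid number.
        "order_value": float(raw_value) if raw_value else None,
    }

extracted = [
    {"customer_id": "1001", "country": "UK", "order_value": "250.00"},
    {"customer_id": "1002", "country": " uk ", "order_value": ""},
]
cleaned = [transform(r) for r in extracted]
print(cleaned)
```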
5
Q

Loading is when the

A
  • newly cleaned and prepared data is uploaded into the destination database ready for use
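
The loading step can be sketched with an in-memory SQLite database standing in for the destination database. The `orders` table and its columns are assumptions made for this example, not part of the source material.

```python
import sqlite3

# Cleaned records, as (customer_id, country, order_value) tuples (invented data).
cleaned = [(1001, "GB", 250.0), (1002, "GB", None)]

# An in-memory database stands in for the real destination system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "customer_id INTEGER PRIMARY KEY, country TEXT, order_value REAL)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", cleaned)
conn.commit()

loaded_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(loaded_rows)
```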
6
Q

A data warehouse is a

A
  • store for data that has been loaded in ETL process
  • it is stored in a systematic and logical way, ready for further interrogation and analysis by the business intelligence function
7
Q

Business intelligence (BI) is the

A
  • technology-driven process of analysing business data to create insightful and actionable information that helps improve the operations or products of the business
  • findings are presented using data visualisation techniques as well as traditional summaries and reports
  • Data mining is an important component of BI
8
Q

Data mining is the

A
  • process of uncovering patterns and other valuable information from large sets of data in the data warehouse
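
One very simple way to picture pattern discovery is counting which items tend to appear together in transactions. The sketch below does this for a handful of invented shopping baskets; the data and the threshold of two co-occurrences are assumptions, and real data mining uses far larger datasets and more sophisticated techniques.

```python
from collections import Counter
from itertools import combinations

# Invented transactions, each a set of items bought together.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep only pairs seen together at least twice (arbitrary threshold).
frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= 2}
print(frequent_pairs)
```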
9
Q

Big data, ETL and BI - main challenges for traditional ETL programs:

A
  • Rate of growth - data volumes are growing at unprecedented levels
  • Types and sources - data for modern businesses comes in all shapes and sizes and from multiple internal and external sources
  • New data technology - innovative techniques must be used to handle the challenges of big data: volume, variety, veracity and velocity
10
Q

The finance function's (FF's) role in data engineering, extraction and mining

A
  • they will use their unique set of competencies to act as an interface between data specialists and the business
  • making the generated data commercially relevant and transforming it into valuable information
11
Q

Data engineering, extraction and mining and the information to impact framework (pg 212) - they will add value through:

A
  • Assembling information - raw data from a variety of relevant sources is collected, cleaned and transformed into meaningful information
  • Generating insights - financial and non-financial information is analysed for insights to improve performance
  • Influencing decision makers - insights are used to advise and influence the relevant stakeholders
  • Achieving impact - guide actions to help achieve the desired impact