DAT Data Scale Flashcards

1
Q

What are the five Vs of Big Data?

A
Variety (many types of data)
Volume
Velocity
Veracity (trustworthiness of data drawn from many sources)
Value
2
Q

Define Big Data

A

Data which is beyond the capacity of traditional processing technologies.

3
Q

What are the different layers of Data Structuring?

A

Structured - Well defined model
Semi-structured - Definition embedded within data, eg XML
Quasi-structured - Erratic structure, eg web click streams
Unstructured - No structure, eg image, text
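
To make the four layers concrete, here is a minimal Python sketch that handles one small sample of each. All sample values, field names, and the click stream line format are hypothetical and chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Structured: a fixed, well-defined model (every row has the same fields).
structured_row = {"customer_id": 42, "name": "Ada", "country": "UK"}

# Semi-structured: the definition (tags, attributes) travels inside the data.
xml_doc = "<customer id='42'><name>Ada</name><country>UK</country></customer>"
root = ET.fromstring(xml_doc)
semi_structured = {"customer_id": int(root.get("id")), "name": root.findtext("name")}

# Quasi-structured: erratic format that needs effort to parse,
# eg a (hypothetical) web click stream line.
click = "2024-01-05T10:15:00 GET /products?id=7&ref=home"
match = re.search(r"[?&]id=(\d+)", click)
product_id = int(match.group(1)) if match else None

# Unstructured: no model at all, so only generic operations apply directly.
text = "Ada called to say the delivery arrived late."
word_count = len(text.split())
```

The amount of parsing work grows as the structure weakens: the structured row needs none, the XML needs a parser, the click stream needs a hand-written pattern, and the free text offers nothing to key on.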

4
Q

What are the Challenges of Scale?

A
Infrastructure
Architectural complexities
Security
Data quality
Ethics
5
Q

How to deal with Scale?

A

Two options:
Apache Hadoop - Open source framework for distributed storage and processing. Runs on clusters of non-specialised (commodity) computers.
Cloud - On demand computing capacity, eg AWS, Azure.

6
Q

What is Master Data Management?

A

Organisations often capture data about the same real-world items in several systems, with slight differences in data structure and values.
MDM sets one dataset as the canonical gold standard, the single source of truth that the other datasets are referenced against.
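
One way to picture this is a lookup from each system's local record to a single master record. The sketch below is an illustration only: the `GOLDEN` and `XREF` tables, the system names, and the sample records are all hypothetical, and real MDM tooling does far more (matching, merging, governance).

```python
# Hypothetical golden (master) records, keyed by a master ID.
GOLDEN = {"C001": {"name": "Acme Ltd", "country": "GB"}}

# Hypothetical cross-reference from (system, local key) to master ID.
XREF = {("crm", "acme-ltd"): "C001", ("billing", "10077"): "C001"}


def resolve(system, local_key):
    """Return the golden record for a source-system record, or None if unmapped."""
    master_id = XREF.get((system, local_key))
    return GOLDEN.get(master_id) if master_id is not None else None


# The same real-world customer, captured with different structure and values.
crm_record = {"key": "acme-ltd", "company": "ACME Limited", "country": "UK"}
billing_record = {"key": "10077", "cust_name": "Acme Ltd.", "ctry": "GB"}

from_crm = resolve("crm", crm_record["key"])
from_billing = resolve("billing", billing_record["key"])
```

Both lookups return the same golden record, so downstream reports can use one agreed name and country regardless of which system the data came from.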

7
Q

What are Controlled Vocabularies?

A

They provide a reference system for data and terms, promoting consistency.
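
As a rough illustration, a controlled vocabulary can be applied as a mapping from free-form input to an approved term. The vocabulary contents and the `normalise` helper below are hypothetical.

```python
# Hypothetical controlled vocabulary: approved term -> accepted variants.
VOCAB = {
    "United Kingdom": {"uk", "u.k.", "great britain", "united kingdom"},
    "United States": {"us", "usa", "u.s.", "united states"},
}

# Invert it once so each variant points at its approved term.
LOOKUP = {variant: term for term, variants in VOCAB.items() for variant in variants}


def normalise(value):
    """Map a free-form value to its approved term; reject anything unknown."""
    key = value.strip().lower()
    if key not in LOOKUP:
        raise ValueError(f"{value!r} is not in the controlled vocabulary")
    return LOOKUP[key]
```

Rejecting unknown values, rather than storing them as typed, is the design choice that keeps the dataset consistent; a real system might instead queue them for review.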

8
Q

What are the stages of Data Migration?

A
Data migration is project based, not an ongoing process. The stages are:
Selection
Preparation
Extraction
Transformation
Deposition
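
The five stages can be sketched as a small pipeline. The card only names the stages, so what each function does here (filtering active records, dropping empty emails, and so on) is an assumption for illustration, as are the record fields and sample data.

```python
def select(source):
    """Selection: decide which records are in scope (assumed: active ones only)."""
    return [r for r in source if r["active"]]


def prepare(records):
    """Preparation: clean before moving (assumed: drop records with no email)."""
    return [r for r in records if r.get("email")]


def extract(records):
    """Extraction: pull the fields to be moved out of the source."""
    return [(r["id"], r["email"]) for r in records]


def transform(rows):
    """Transformation: reshape the data into the target model."""
    return [{"customer_id": i, "email": e.strip().lower()} for i, e in rows]


def deposit(rows, target):
    """Deposition: load the rows into the target system."""
    target.extend(rows)
    return target


# Hypothetical legacy data.
legacy = [
    {"id": 1, "email": " Ada@Example.com ", "active": True},
    {"id": 2, "email": "", "active": True},
    {"id": 3, "email": "bob@example.com", "active": False},
]
new_system = deposit(transform(extract(prepare(select(legacy)))), [])
```

Only record 1 reaches the target: record 2 is dropped in preparation and record 3 in selection.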
9
Q

What are some types of Data Migration?

A

Database Migration
Application Migration
Business Process Migration

10
Q

What are the selection criteria for Data Integration Tools?

A

Future Scalability
Implementation
Support Costs

11
Q

What is Data Synchronisation?

A

Ensuring consistency when an organisation has multiple copies of data.

A data steward oversees the day-to-day management of the dataset and sets validation rules. The data owner approves these rules and sets the framework for managing the dataset.

To think about: ownership, updates, format, security, data quality, performance, maintenance.
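
A first step towards keeping copies consistent is detecting where they differ. The sketch below compares a primary copy with a replica by hashing each record; the function names and sample data are hypothetical, and this is only one possible approach, not a full synchronisation protocol.

```python
import hashlib
import json


def fingerprint(record):
    """Stable hash of a record, independent of key order."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def out_of_sync(primary, replica):
    """Return the keys whose record is missing from, or differs in, the replica."""
    return sorted(
        key
        for key, record in primary.items()
        if key not in replica or fingerprint(replica[key]) != fingerprint(record)
    )


# Hypothetical copies of the same dataset.
primary = {"a": {"qty": 1}, "b": {"qty": 2}, "c": {"qty": 3}}
replica = {"a": {"qty": 1}, "b": {"qty": 5}}
stale = out_of_sync(primary, replica)
```

Here `stale` lists the records that need to be re-sent to the replica. Which copy counts as the primary is an ownership decision, which is why ownership heads the list of things to think about.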
