Big Data Flashcards

1
Q

Big Data

A

The exponential growth in the amount of data between 2009 and 2020; the sheer scale of that growth is what makes big data "big".

2
Q

Your data

A

Social media platforms scrape everything you type or read in order to target ads. Any service you sign up for or use for free is likely to use or sell your data.

3
Q

What is big data?

A

An all-encompassing term covering data, data frameworks, and the tools and techniques used to process and analyse data

4
Q

5 ‘V’s of Big Data

A

Volume, Velocity, Variety, Veracity and Value

5
Q

Volume

A

Data at Rest: huge volumes of data generated from various sources such as social media, machines, networks and interactions

6
Q

Velocity

A

Data in Motion: refers to the speed at which data is being created in real time

7
Q

Variety

A

Data in Many Forms: data is gathered in many forms, e.g. PDFs, emails, video, social media posts, location data, etc.

Unstructured sources of data pose issues for data analysis and storage

8
Q

Veracity

A

Data in Doubt: uncertainty due to inconsistency and incompleteness, ambiguities, latency, deception and approximations

9
Q

Value

A

Data in Worth: value of the data to a business in terms of its ability to generate profit

10
Q

Structured Data

A

Guarantees that every entry of data has the same format, e.g. spreadsheet/CSV columns

11
Q

Unstructured Data

A

Data with no fixed format, e.g. search results: website links, images, videos, all in different formats and structures

12
Q

Semi-structured data

A

e.g. an XML document: a combination of both structured and unstructured data. Structured in form, but with fewer constraints than structured data
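
The difference between structured and semi-structured data can be sketched with Python's standard library. This is only an illustration: the sample records and the field names (`name`, `age`, `email`) are invented here, not taken from the deck.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: every row has exactly the same columns (hypothetical sample data).
csv_text = "name,age\nAda,36\nAlan,41\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: tags give the data a shape, but records may differ.
# Here the second person has no <age> and carries an extra <email>.
xml_text = """
<people>
  <person><name>Ada</name><age>36</age></person>
  <person><name>Alan</name><email>alan@example.com</email></person>
</people>
""".strip()
root = ET.fromstring(xml_text)
people = [{child.tag: child.text for child in person} for person in root]

print(rows)    # every dict has the same keys
print(people)  # the dicts have different keys
```

The CSV reader can assume one fixed set of columns, while the XML reader has to discover each record's fields as it goes.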

13
Q

Data Ubiquity

A

Automatic data capture, opening up of existing data, simulations, approximations, synthetic data, exponential growth in storage, increase in bandwidth, faster algorithms

These are the reasons for the exponential growth of data

14
Q

Examples of Big Data

A

The NYSE generates one terabyte of new trade data per day
Facebook generates 4 petabytes of data per day
A single jet engine can generate 10 terabytes of data per flight
Walmart processes 40 petabytes of data per day

15
Q

Data-Information-Knowledge-Wisdom Model

A
  1. Data: the raw data (red)
  2. Information: the meaning of the data (The traffic light has turned red)
  3. Knowledge: the context of the data (The traffic light I’m driving towards has turned red)
  4. Wisdom: the data is then applied (Stop the car)
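
The four levels of the traffic light example can be written out as a toy sketch. The `Observation` class and the function names are hypothetical, and the only rule covered is the red-light case from the card.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    colour: str        # Data: the raw value, e.g. "red"
    approaching: bool  # context: am I driving towards this light?


def information(obs: Observation) -> str:
    # Information: the raw value is given a meaning.
    return f"The traffic light has turned {obs.colour}"


def knowledge(obs: Observation) -> str:
    # Knowledge: the information is placed in the observer's context.
    if obs.approaching:
        return f"The traffic light I'm driving towards has turned {obs.colour}"
    return f"A traffic light elsewhere has turned {obs.colour}"


def wisdom(obs: Observation) -> str:
    # Wisdom: the knowledge is applied to choose an action.
    if obs.approaching and obs.colour == "red":
        return "Stop the car"
    return "No rule defined in this sketch"


obs = Observation(colour="red", approaching=True)
for level in (information, knowledge, wisdom):
    print(level(obs))
```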
16
Q

Issues in Big Data

A
  • Data is not knowledge; it sits upstream of information

  • It is possible to be data rich but information poor (DRIP)

17
Q

Data Science

A

Data science is concerned with analysing primary or experimental data

18
Q

Descriptive analytics

A

Transactional and interactional data are used to identify trends

19
Q

Diagnostic analytics

A

Historical data can be measured against other data to find out why something happened

20
Q

Predictive analytics

A

Tells us what is most likely to happen, building on the results of descriptive and diagnostic analytics

21
Q

Prescriptive analytics

A

Prescribes what action to take to eliminate a future problem or take advantage of a trend
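
The four kinds of analytics can be compared side by side on one tiny dataset. This is a minimal sketch under stated assumptions: the `sales` and `ad_spend` figures, the `target`, and the decision rule are all invented for illustration, and the least-squares trend line stands in for whatever model a real project would use.

```python
from math import sqrt
from statistics import mean

# Hypothetical monthly figures, invented for illustration.
sales = [100, 110, 125, 130, 150, 160]
ad_spend = [10, 11, 13, 13, 16, 17]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Descriptive: what happened?
average_sales = mean(sales)
growth = sales[-1] - sales[0]

# Diagnostic: why did it happen? Compare sales against other data.
# (A strong correlation is a hint, not proof of cause.)
r = pearson(ad_spend, sales)

# Predictive: what is most likely to happen next month?
months = list(range(len(sales)))
slope, intercept = fit_line(months, sales)
forecast = slope * len(sales) + intercept

# Prescriptive: what should we do about it? (hypothetical target and rule)
target = 180
action = "increase ad spend" if forecast < target else "hold ad spend"

print(average_sales, growth, round(r, 3), round(forecast, 1), action)
```

Each stage builds on the one before it: the forecast relies on the trend found descriptively, and the recommended action relies on the forecast.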