Introduction Flashcards

1
Q

how is “Big Data” both a noun and an adjective?

A

we have big data (noun), and we use big data tools (adjective).

2
Q

big data is so big that (definition):

A

traditional data processing application software is inadequate

3
Q

what are the four v’s?

A

volume, velocity, variety, veracity (how trustworthy is the data? e.g. disinformation)

4
Q

data was always powerful; what changed?

A

new sources; seemingly insignificant data is actually the most valuable (niche products, long tail model)

5
Q

5 use cases of big data

A

remote patient monitoring: preventative care
product sensors: manufacturing support
real-time location data: geo-advertising
public surveys: tailored public services
social media: marketing, retail

6
Q

big data acquisition challenges (besides the main storage problem)

A

selection, filtering & compression, collecting metadata
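
A minimal Python sketch of the three acquisition steps above, assuming a hypothetical list of sensor readings and a made-up threshold rule for selection. The function name `acquire` and the metadata fields are illustrative choices, not part of the course material.

```python
import hashlib
import json
import zlib


def acquire(readings, threshold):
    """Select, compress, and describe a batch of readings (illustrative only)."""
    # Selection / filtering: keep only readings at or above the (assumed) threshold.
    selected = [r for r in readings if r["value"] >= threshold]

    # Compression: serialize the kept records and shrink them before storage.
    payload = json.dumps(selected).encode("utf-8")
    compressed = zlib.compress(payload)

    # Metadata: record what was seen, what was kept, and a checksum of the payload.
    metadata = {
        "records_seen": len(readings),
        "records_kept": len(selected),
        "raw_bytes": len(payload),
        "stored_bytes": len(compressed),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    return compressed, metadata


readings = [{"sensor": "a", "value": 3}, {"sensor": "b", "value": 9}, {"sensor": "c", "value": 7}]
blob, meta = acquire(readings, threshold=5)
restored = json.loads(zlib.decompress(blob))
```

The metadata is what later lets you check that the stored blob is intact and tells you how much was discarded at acquisition time.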

7
Q

Big data doesn’t have to be too big

A

we process a lot of data to understand which of it is actually valuable

8
Q

what is fundamental to understand, measure, and control the data?

A

metadata

9
Q

big data processing challenges

A

single machine limitation, parallelization, fault-tolerance
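
One way to picture parallelization and fault tolerance is the sketch below, which splits a word count into chunks, runs them concurrently, and retries a chunk that fails. It is only an illustration: the worker threads stand in for separate machines in a cluster, and the helper names (`count_words`, `with_retry`, `parallel_word_count`) are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Map step: count the words in one chunk of lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def with_retry(task, chunk, attempts=3):
    """Fault tolerance: re-run a failed chunk instead of failing the whole job."""
    for attempt in range(attempts):
        try:
            return task(chunk)
        except Exception:
            if attempt == attempts - 1:
                raise  # give up only after the last attempt


def parallel_word_count(lines, chunk_size=2, workers=4):
    # Parallelization: split the input so each worker handles one chunk.
    chunks = [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Reduce step: merge the partial counts from every chunk.
        for partial in pool.map(lambda c: with_retry(count_words, c), chunks):
            total.update(partial)
    return total


result = parallel_word_count(["a b", "b c", "c c"])
```

Because each chunk is processed independently, a failure only costs the work for that chunk, which is the idea behind re-executing tasks in cluster frameworks.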

10
Q

3 main scenarios for programming solutions for big data

A

batch, interactive (process quickly) and streaming
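
The difference between batch and streaming can be sketched with a simple average, assuming a hypothetical list of numeric readings. In the batch case all data is available before processing starts; in the streaming case the result is updated as each value arrives. The names `batch_mean` and `StreamingMean` are illustrative.

```python
def batch_mean(values):
    """Batch: the whole dataset is available, so process it in one pass."""
    return sum(values) / len(values)


class StreamingMean:
    """Streaming: keep a running result and update it per incoming value."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, value):
        # Incremental update: no need to store earlier values.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean


readings = [2.0, 4.0, 6.0]

stream = StreamingMean()
running = [stream.update(v) for v in readings]  # one result per arriving value
```

An interactive workload would sit between the two: it runs over stored data like the batch case, but is expected to answer quickly enough for a person waiting on the result.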

11
Q

what is a data lake?

A

a central repository where data is kept in its original format and queried only when needed
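
A toy sketch of the "original format, queried only when needed" idea, assuming an in-memory dictionary stands in for the lake's storage. The file paths, field names, and helper functions are made up for illustration; the point is that raw JSON and CSV are stored untouched and only parsed at query time.

```python
import csv
import io
import json

# The "lake": raw content kept exactly as it arrived, in different formats.
lake = {
    "sensors/2024-01.json": '[{"id": "s1", "temp": 21.5}, {"id": "s2", "temp": 19.0}]',
    "surveys/2024-01.csv": "respondent,score\nr1,4\nr2,5\n",
}


def read_raw(path):
    """Interpret a raw file only when a query asks for it."""
    raw = lake[path]
    if path.endswith(".json"):
        return json.loads(raw)
    if path.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(raw)))
    raise ValueError(f"unsupported format: {path}")


def average(path, field):
    """Query: parse the file on demand and average one field."""
    rows = read_raw(path)
    return sum(float(row[field]) for row in rows) / len(rows)


avg_temp = average("sensors/2024-01.json", "temp")
avg_score = average("surveys/2024-01.csv", "score")
```

Nothing is converted or cleaned on the way in, so the cost of interpreting the data is paid only by the queries that actually need it.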

12
Q

NoSQL

A

“not only SQL”: scalable databases, typically without ACID guarantees, and with no standard for modeling or querying

13
Q

NewSQL

A

combines the benefits of relational databases (SQL, ACID transactions) and NoSQL (horizontal scalability)

14
Q

two main approaches listed under analytics techniques within the big data framework

A

data mining and machine learning
