big data Flashcards

1
Q

define big data

A

too big or too complicated to be managed using normal techniques
no universally agreed definition

2
Q

what are the four V’s of big data

A

Volume: the size of the data; terabytes or even exabytes to process.
Velocity: the speed at which data flows; streamed within milliseconds and recorded in real time, e.g. a sensor in a car.
Variety: the nature of the data (structured and unstructured formats); structured - databases, semi-structured - XML, unstructured - web search.
Veracity: the validity of the data; data inconsistency, latency.
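
The three data formats named in this answer can be illustrated with a minimal sketch. The sample values (car speed readings, a search query) are made up for illustration and are not part of the original card.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: fixed columns, like a database table (hypothetical sample rows).
structured = io.StringIO("id,speed_kmh\n1,62\n2,58\n")
rows = list(csv.DictReader(structured))

# Semi-structured: tagged but flexible in shape, e.g. XML.
semi = ET.fromstring("<car id='1'><speed unit='kmh'>62</speed></car>")
speed = int(semi.find("speed").text)

# Unstructured: free text with no schema, e.g. a web search query.
unstructured = "cheap flights to lisbon next weekend"
words = unstructured.split()

print(rows[0]["speed_kmh"], speed, len(words))
```

The structured and semi-structured values can be looked up by name; the unstructured text has to be split or otherwise interpreted before anything can be extracted from it.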

3
Q

what is the problem with big data

A

storage, especially when the data is in different formats

4
Q

what is big data used for

A

data mining, data storage, data analysis, data sharing, data visualisation

5
Q

examples of big data

A
  • The New York Stock Exchange generates about one
    terabyte of new trade data every day
  • Facebook generates 4 petabytes of new data per
    day, mainly photo and video uploads,
    message exchanges, comments etc.
  • A single jet engine can generate 10+ terabytes of
    data per flight
  • Walmart processes 40 petabytes of data per day
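
To put these figures on one scale, here is a rough conversion sketch. It assumes decimal (SI) units, where 1 petabyte = 1,000 terabytes and 1 exabyte = 1,000,000 terabytes. The function name `yearly_eb` is a hypothetical helper, and the daily figures are the ones quoted on the card.

```python
TB_PER_PB = 1_000        # assumption: decimal (SI) units
TB_PER_EB = 1_000_000

# Daily volumes from the card, expressed in terabytes.
daily_tb = {
    "NYSE": 1,
    "Facebook": 4 * TB_PER_PB,
    "Walmart": 40 * TB_PER_PB,
}

def yearly_eb(tb_per_day):
    """Convert a daily volume in terabytes to exabytes per 365-day year."""
    return tb_per_day * 365 / TB_PER_EB

for name, tb in daily_tb.items():
    print(f"{name}: {tb:,} TB/day, about {yearly_eb(tb):.4f} EB/year")
```

On these numbers, the Walmart figure is 40,000 times the daily volume quoted for the New York Stock Exchange.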
6
Q

issues with big data

A
  • Data is not knowledge
  • It is very possible to be data rich but information
    poor (DRIP)
  • Large data sets are often collected opportunistically,
    and not for the purpose of answering the question
    that you are now interested in
  • Long-term longitudinal data sets are often difficult
    to exploit
  • Privacy, confidentiality and ethical considerations