Chapter 9 Flashcards
What is Big Data?
Large and complex data sets.
Where does Big Data come from?
Social media,
Smartphones,
Sensors.
What is primary data in Big Data?
Data collected specifically for a research purpose.
What is secondary data in Big Data?
Data not specifically collected for research.
Which three characteristics of Big Data are good for research?
1: Big
2: Always on
3: Nonreactive
Which seven characteristics of Big Data are bad for research?
1: Incomplete
2: Inaccessible
3: Nonrepresentative
4: Drifting
5: Algorithmically confounded
6: Dirty
7: Sensitive
Why is Big an advantage?
A large data set contains enough observations to study rare events and heterogeneity.
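The rare-events point can be illustrated with simple arithmetic. This is a minimal sketch, assuming a hypothetical event rate of 1 in 10,000 records (the rate and the function name are illustrative, not from the chapter): the expected number of observed events grows in proportion to the size of the data set.

```python
def expected_event_count(sample_size: int, event_rate: float) -> float:
    """Expected number of records in which the rare event occurs."""
    return sample_size * event_rate


# Hypothetical rate (an assumption): the event occurs in 1 of every 10,000 records.
RATE = 1 / 10_000

for n in (1_000, 100_000, 10_000_000):
    count = expected_event_count(n, RATE)
    print(f"{n:>10,} records -> about {count:,.1f} events")
```

Under this assumed rate, a survey-sized sample of 1,000 records is expected to contain less than one event, while ten million records contain roughly a thousand, which is enough to analyze.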
Why is always on an advantage?
You get real-time measurements, which helps with spotting trends.
Why is nonreactive an advantage?
Measurement in Big Data sources is less likely to change people's behavior.
Why is incomplete a disadvantage?
The data may leave out information that is important for the research.
Why is inaccessible a disadvantage?
Legal and compliance restrictions limit access to the data.
Why is nonrepresentative a disadvantage?
The data may not reflect the wider population. E.g., consumers say reviews are important, but the review data is not always valid.
Why is drifting a disadvantage?
A Big Data source can change over time: its users, how it is used, or the platform itself.
Why is algorithmically confounded a disadvantage?
The design of the platform can change behavior. (Facebook encourages users to have at least 20 friends, so small numbers of friends are not easy to study.)
Why is dirty a disadvantage?
Big Data can be loaded with junk.
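As a rough illustration of what removing junk can look like, here is a minimal sketch that treats empty entries and exact duplicates as junk. Both the definition of junk and the sample records are assumptions for illustration; real sources usually need source-specific cleaning rules.

```python
def clean_records(records):
    """Drop empty entries and exact duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for record in records:
        text = record.strip()
        # Skip blank entries and anything already kept.
        if not text or text in seen:
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned


# Hypothetical raw records (made up for this example).
raw = ["great phone", "", "great phone", "  ", "battery died fast"]
print(clean_records(raw))
```

Only two of the five raw records survive here, which shows how much of a source can be junk before any analysis starts.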