Lecture 1 Flashcards
Tall Data
many observations, few variables
Wide data
few observations, many variables
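A minimal sketch of the tall/wide distinction, treating a data set as a list of rows (observations) where each row is a list of values (variables). The helper name `describe_shape` and the 10x ratio threshold are illustrative assumptions, not a standard cutoff.

```python
def describe_shape(rows, ratio=10):
    """Label a data set 'tall', 'wide', or 'neither' from its dimensions.

    rows  -- list of observations, each a list of variable values
    ratio -- hypothetical threshold: how many times larger one
             dimension must be than the other to earn the label
    """
    if not rows or not rows[0]:
        return "neither"  # empty data has no meaningful shape
    n_obs = len(rows)       # number of observations (rows)
    n_vars = len(rows[0])   # number of variables (columns)
    if n_obs >= ratio * n_vars:
        return "tall"       # many observations, few variables
    if n_vars >= ratio * n_obs:
        return "wide"       # few observations, many variables
    return "neither"


# 1000 observations of 2 variables
tall = [[i, i * 2] for i in range(1000)]
# 3 observations of 500 variables
wide = [list(range(500)) for _ in range(3)]

print(describe_shape(tall))
print(describe_shape(wide))
```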
Advantages of Big Data
1.Big
2.Always on
3.Nonreactive (people are not aware they are being recorded)
Incomplete data
Some info is missing
Inaccessible data
Ex. PPD
Outside of the organization:
business and ethical barriers to accessing the data
Inside of the organization:
databases are not integrated within the system
Unrepresentative data
Invalid data
Dirty data
Loaded with junk or spam
Ex. Twitter bots; Fake reviews
Sensitive data
data whose release would expose private or confidential details
Big data
a collection of complex data sets from which insights are extracted using tools and models
primary data
data collected to answer a research question
secondary data
data collected for non-research purposes
Uses of Big data
1.Personalization
(recommendation algorithms)
2.Boosting engagement
(Facebook likes)
3.New product development
4.Reducing churn
(when a customer quits)
5.Public for economy
Customer churn
a customer quits a service
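As a rough illustration, churn is often summarized as a rate: the share of customers at the start of a period who quit during it. The function name `churn_rate` and the sample numbers are assumptions made up for this sketch.

```python
def churn_rate(customers_at_start, customers_lost):
    """Fraction of starting customers who quit during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return customers_lost / customers_at_start


# Hypothetical example: 200 customers at the start of the month, 10 quit
rate = churn_rate(200, 10)
print(f"{rate:.1%}")
```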
Is Big Data biased or unbiased?
Biased
Insights of Big Data
Big is relative
Data quality