Session 1 Flashcards
Big Data (1997)
“Data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk. We call this the problem of big data.”
What characterizes Big Data?
- Volume
- Velocity
- Variety
- Veracity
MapReduce
Distributed computing paradigm
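To make the paradigm concrete, here is a minimal single-process sketch of the classic word-count example. It only imitates the map, shuffle, and reduce steps in plain Python; a real framework would run the map and reduce steps in parallel across many machines. The function names and sample documents are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group all emitted values by key (done by the framework between map and reduce)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())

# Hypothetical input: each string stands in for one document on one worker.
docs = ["big data is big", "data is everywhere"]
counts = word_count(docs)
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both steps could be distributed without changing the result.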
Different types of analytics
- Descriptive Analytics
- Diagnostic Analytics
- Predictive Analytics
- Prescriptive Analytics
Descriptive Analytics
What happened?
Provides insight into the data, so that one can better understand what data to collect and store, and into ways to improve future models.
Visualization, Clustering, Summary Statistics
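As a small illustration of the "Summary Statistics" technique, the sketch below describes what happened in a made-up series of daily order counts. The data and the variable names are assumptions; only the Python standard library is used.

```python
import statistics

# Hypothetical data: number of orders on seven consecutive days.
daily_orders = [12, 15, 11, 30, 14, 13, 16]

summary = {
    "count": len(daily_orders),
    "mean": statistics.mean(daily_orders),
    "median": statistics.median(daily_orders),
    "stdev": statistics.stdev(daily_orders),  # sample standard deviation
    "min": min(daily_orders),
    "max": max(daily_orders),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The gap between the mean and the median hints at an unusual day (the value 30), which is the kind of observation descriptive analytics is meant to surface.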
Diagnostic Analytics
Why did it happen?
Provides insight into the data, so that one can better understand what data to collect and store, and into ways to improve future models.
Causal Analysis, Simulation
Predictive Analytics
What will happen?
Builds a model to predict when something will happen.
Prediction, Machine Learning
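A minimal sketch of the idea of building a model from past data and using it to predict: a straight line fitted by ordinary least squares. The monthly sales figures and the `fit_line` helper are hypothetical, and least squares is just one simple modeling choice among many.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: sales observed in months 1 to 5.
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(months, sales)

# Use the fitted model to predict the not-yet-observed month 6.
forecast = slope * 6 + intercept
print(f"slope={slope}, intercept={intercept}, forecast for month 6={forecast}")
```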
Prescriptive Analytics
How can we make it happen?
Automates the action to be taken based on a prediction.
Optimization, Planning, Automation
Descriptive / Diagnostic Analytics - Tasks
- Data visualization
- Clustering
- Co-occurrence grouping
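Co-occurrence grouping looks for items that tend to appear together. A minimal sketch, assuming a handful of made-up shopping baskets, counts how often each pair of items shows up in the same basket:

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one shopping basket.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

pair_counts = Counter()
for basket in baskets:
    # Sort so that each pair is always stored in the same order.
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

most_common_pair, support = pair_counts.most_common(1)[0]
print(most_common_pair, support)
```

Counting every pair is only practical for small item sets; it is shown here purely to illustrate the task.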
Predictive Analytics - Tasks
- Classification
- Regression
- Link prediction
Prescriptive Analytics - Tasks
- Uplift Modeling
- Automation
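Uplift modeling estimates how much an action (for example, sending a promotion) changes the outcome compared with doing nothing. The sketch below is a deliberately simplified version: it compares response rates between a treated group and a control group per segment and then automatically selects the segments with positive uplift. The segment names, the 0/1 outcomes, and the `response_rate` helper are assumptions; practical uplift models typically use machine learning rather than raw rate differences.

```python
def response_rate(outcomes):
    """Share of customers who responded (1 = responded, 0 = did not)."""
    return sum(outcomes) / len(outcomes)

# Hypothetical experiment results per customer segment.
segments = {
    "new_customers": {"treated": [1, 1, 0, 1], "control": [0, 1, 0, 0]},
    "loyal_customers": {"treated": [1, 1, 1, 0], "control": [1, 1, 0, 1]},
}

# Uplift = response rate with the action minus response rate without it.
uplift = {
    name: response_rate(groups["treated"]) - response_rate(groups["control"])
    for name, groups in segments.items()
}

# Automated decision: only target segments where the action helps.
targets = [name for name, value in uplift.items() if value > 0]
print(uplift, targets)
```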