DS LAMS Flashcards
Steps in DS pipeline
Problem formulation
Data visualisation
ML
Statistical inference
Which step relies heavily on algorithmic optimisation?
ML
How much
How many
Numeric
Is it type A or B?
Classes
How is it organised?
Structure
Is it weird behaviour?
Anomaly
What should be done next?
Action
To predict numeric
Regression
To predict classes
Classification
To detect clusters
Clustering
Detect anomaly
Anomaly detection
Predict action
Adaptive learning
Types of structured data
Numeric
Categorical
Mixed data
Time series
Network
Unstructured
Text
Image
Voice
Videos
Central tendency
Mean
Median
Dispersion
Standard deviation
Quantiles
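A minimal sketch of the central tendency and dispersion measures listed above, using Python's standard statistics module. The sample values are invented for illustration, and the population standard deviation (pstdev) is an assumed choice; the sample version (stdev) divides by n - 1 instead.

```python
import statistics

# Hypothetical sample, invented for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendency
mean = statistics.mean(values)      # sum of values divided by count
median = statistics.median(values)  # middle value of the sorted data

# Dispersion (population standard deviation, an assumed choice)
std = statistics.pstdev(values)

print("mean:", mean)
print("median:", median)
print("standard deviation:", std)
```

For this sample the mean is 5, the median is 4.5 and the population standard deviation is 2.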
Central tendency
Median
Mean
Box plot
3 lines
Q1, Q2, Q3
25th, 50th, 75th percentiles
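The three box plot lines can be computed as quartiles. This is a sketch using statistics.quantiles (Python 3.8+) on invented data; the "inclusive" method is an assumption, and other quantile conventions can give slightly different values on small samples.

```python
import statistics

# Hypothetical sample, invented for illustration only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into four equal parts, giving three cut points:
# Q1 (25th percentile), Q2 (50th, the median), Q3 (75th percentile).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# The interquartile range is the height of the box.
iqr = q3 - q1

print("Q1:", q1, "Q2:", q2, "Q3:", q3, "IQR:", iqr)
```

Here Q1 is 3, Q2 is 5 and Q3 is 7, so the box spans 3 to 7 with the median line at 5.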
Correlation
Draw from x,y=0
Correlation is not causality
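As a rough illustration of what a correlation value measures, here is a sketch of the Pearson correlation coefficient written out by hand. The function name pearson_r and the data are hypothetical, and Pearson is an assumed choice of correlation measure. A value near 1 or -1 only says the two variables move together linearly; it says nothing about which one, if either, causes the other.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric lists (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Variation of each variable around its own mean
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # rises with x
y_down = [10, 8, 6, 4, 2]  # falls as x rises

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
print(r_up, r_down)
```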
Swarm plot
Only 1 variable is very confident
Confidence increases when data is more clear-cut
Gini calculation
2 × (x/n) × (1 - x/n)
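A small sketch of the formula above for the two-class case. It assumes x is the number of samples of one class among n samples in a group (the card does not define the symbols), and gini_binary is a hypothetical helper name.

```python
def gini_binary(x, n):
    """Two-class Gini impurity: 2 * p * (1 - p), where p = x / n.

    Assumes x is the count of one class among n samples (n > 0).
    """
    p = x / n
    return 2 * p * (1 - p)

# Evenly mixed group: the most impure case for two classes.
mixed = gini_binary(5, 10)

# Pure groups: every sample belongs to the same class.
pure_none = gini_binary(0, 10)
pure_all = gini_binary(10, 10)

print(mixed, pure_none, pure_all)
```

The value is 0 for a pure group and reaches its maximum of 0.5 when the two classes are evenly mixed, which matches the idea that a cleaner split gives a lower Gini.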