Interview Prep Flashcards
Bias-Variance Tradeoff
Bias is error due to erroneous or overly simplistic assumptions in the learning algorithm. High bias leads the model to underfit the data, so it performs poorly on both the training set and the test set.
Variance is error due to too much complexity in the learning algorithm. High variance makes the model highly sensitive to noise in the training data, so it overfits and generalizes poorly from the training set to the test set.
Supervised vs unsupervised learning
Supervised = trained on labeled data
Unsupervised = trained on unlabeled data
KNN vs k-means clustering
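KNN = supervised classification algorithm (labels a new point by majority vote among its k nearest labeled neighbors)
K-means = unsupervised clustering algorithm (groups unlabeled points into k clusters around centroids)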
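A minimal sketch of the contrast, using only the standard library. The function names `knn_predict` and `kmeans` are illustrative helpers, not from any library, and the k-means version assumes the starting centroids are passed in by the caller.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Supervised: classify `query` by majority vote among its k nearest labeled points."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def kmeans(points, centroids, iterations=10):
    """Unsupervised: group unlabeled points around the given starting centroids."""
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters
```

Note that `knn_predict` needs `train_labels`, while `kmeans` never sees a label.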
ROC Curve
Receiver operating characteristic: plot of true positive rate vs. false positive rate across classification thresholds
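A rough sketch of how the curve's points could be computed by hand. `roc_points` is an illustrative helper, and it assumes binary labels (1 = positive, 0 = negative) with at least one example of each class.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per score threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        # Everything scoring at or above the threshold is predicted positive.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points
```

Lowering the threshold moves along the curve from (0, 0) toward (1, 1).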
Precision vs. recall vs. accuracy
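Example settings: email spam filtering, cancer screening
Precision = TP / (TP + FP) = true positives / all predicted positives (correctly predicted cancer patients / all predicted cancer patients)
Recall = TP / (TP + FN) = true positives / all actual positives (correctly predicted cancer patients / (correctly predicted cancer patients + cancer patients the model missed))
Accuracy = (TP + TN) / all predictions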
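The three metrics can be sketched from the confusion-matrix counts. The helper names below are illustrative, and labels are assumed to be 1 = positive, 0 = negative.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def precision(tp, fp):
    # Of everything predicted positive, how much really was positive?
    return tp / (tp + fp)

def recall(tp, fn):
    # Of everything really positive, how much did the model find?
    return tp / (tp + fn)

def accuracy(tp, fp, fn, tn):
    # Share of all predictions that were correct.
    return (tp + tn) / (tp + fp + fn + tn)
```

The sketch does not guard against division by zero, for example when nothing is predicted positive.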
Type I vs Type II
Type I = False positive
Type II = False negative
Cross-validation
Repeatedly holding out a different part of the data to test the model on, while training on the rest
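One common form is k-fold cross-validation. A minimal sketch of the splitting step, assuming the data is already shuffled (`k_fold_indices` is an illustrative helper, not a library function):

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

Each sample lands in exactly one test fold; the model would be trained and scored once per fold and the scores averaged.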
Decision tree pruning
Branches that have lower predictive power are removed in order to reduce complexity
F1 score
Harmonic mean of precision and recall: F1 = 2 * precision * recall / (precision + recall)
1 = best, 0 = worst
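F1 is the harmonic mean of precision and recall. A small sketch (the function name is illustrative; returning 0 when both inputs are 0 is a convention chosen here to avoid dividing by zero):

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention for the degenerate case
    return 2 * precision * recall / (precision + recall)
```

Because it is a harmonic mean, F1 is pulled toward the lower of the two values: precision 0.5 with recall 1.0 gives about 0.67, not 0.75.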
How to avoid overfitting
Keep models simple
Use regularization (Lasso / Ridge)
Use cross-validation
Examples of ensemble
Decision trees combined with boosting (e.g. AdaBoost, gradient-boosted trees)
How would you handle missing or incomplete data?
Delete the affected rows, or impute the missing values: replace them with 0, the mean or median of the column, or a value predicted by a model
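A small sketch of the delete and replace options, assuming missing values are represented as `None` (the helper names are illustrative; prediction-based imputation is not shown):

```python
from statistics import mean, median

def drop_missing(rows):
    """Delete every row that contains a missing value."""
    return [row for row in rows if None not in row]

def impute(values, strategy="mean"):
    """Replace missing values in one column with 0, the mean, or the median."""
    observed = [v for v in values if v is not None]
    if strategy == "mean":
        fill = mean(observed)
    elif strategy == "median":
        fill = median(observed)
    elif strategy == "zero":
        fill = 0
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]
```

The median is less affected by outliers: for the column `[1, None, 3, 100]` the median fill is 3, while the mean fill would be about 34.7.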
Write pseudocode for linear regression
1) Define cost function (total squared error) to minimize
2) Initialize the parameters (weights and intercept) for gradient descent
3) Iterate gradient descent, using alpha (learning rate) as the step size, for a set number of iterations
4) Analyze results
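The steps above can be sketched for a single feature as follows. This is one possible implementation, not the only one: it minimizes the mean (rather than total) squared error so the learning rate does not depend on the dataset size, and the default `alpha` and `iterations` are arbitrary assumptions.

```python
def fit_linear_regression(xs, ys, alpha=0.05, iterations=2000):
    """Fit y ~ w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0  # step 2: initialize parameters
    n = len(xs)
    for _ in range(iterations):  # step 3: repeat the update
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        w -= alpha * grad_w
        b -= alpha * grad_b
    # Step 1's cost function, evaluated at the end for step 4 (analysis).
    cost = sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n
    return w, b, cost
```

If alpha is too large the updates diverge; if it is too small, more iterations are needed to converge.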
Why and how to normalize data
Ensure all data is on the same scale
z = (x - mean) / SD
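A minimal sketch of this z-score scaling, z = (x - mean) / SD. It assumes the population standard deviation (`pstdev`); the sample version (`stdev`) is an equally valid choice, and `standardize` is an illustrative name.

```python
from statistics import mean, pstdev

def standardize(values):
    """Rescale values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

In practice the mean and SD would be computed on the training set only and then reused to scale the test set.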
Data analysis process
1) Get access to data
2) EDA to get familiar with data + understand any issues
3) Determine how to prepare the data
4) Explore models to use
5) Execute, visualize, and inform