Algorithms Flashcards
algorithm for binary classification, multiple features, not enough data for a neural net
binary logistic regression
algorithm for predicting an ordered rating (e.g., the number of stars given to a movie)
ordinal logistic regression
image classification algorithm?
CNN
time series forecasting algorithms?
Bayesian structural time series models, LSTMs, RNNs, ARIMA, SARIMA, SARIMAX
5 assumptions of linear regression?
1) linear relation between features and target
2) little or no multicollinearity between features
3) homoscedasticity (residuals have constant variance)
4) residuals (errors) follow a normal distribution
5) little or no autocorrelation in residuals (autocorrelation usually shows up in time series, where each step depends on the previous one)
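Assumption 5 can be checked numerically. A minimal sketch, assuming a single feature and using the Durbin-Watson statistic (not named on the card above, but one common diagnostic): values near 2 suggest little first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The data and the helper names `fit_line` and `durbin_watson` are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


# Hypothetical data: a line plus noise that flips sign at every step,
# i.e. strongly negatively autocorrelated errors.
xs = [float(i) for i in range(8)]
noise = [0.5 if i % 2 == 0 else -0.5 for i in range(8)]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
dw = durbin_watson(residuals)
print(f"Durbin-Watson statistic: {dw:.2f}")  # well above 2 for this data
```

Plotting residuals against fitted values is a similar informal check for assumptions 1 and 3.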
If we have two correlated features, what do we do?
Combine them into a single feature or drop one of them
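A minimal sketch of both options, assuming Pearson correlation as the measure and a cutoff of 0.9 (the cutoff, the feature names, and the data are illustrative assumptions, not fixed rules). Combining is done here by averaging z-scores, which is only one of several reasonable choices.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)


def zscores(xs):
    """Standardize to mean 0 and (population) standard deviation 1."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / std for x in xs]


# Hypothetical features: two noisy measurements of the same quantity.
sensor_a = [1.0, 2.0, 3.0, 4.0, 5.0]
sensor_b = [1.2, 1.9, 3.1, 4.2, 4.8]

THRESHOLD = 0.9  # assumed cutoff for "too correlated"
r = pearson_r(sensor_a, sensor_b)

if abs(r) > THRESHOLD:
    # Option 1: drop one feature and keep the other.
    kept = sensor_a
    # Option 2: combine into one feature by averaging z-scores.
    # The sign flip keeps negatively correlated features from cancelling out.
    sign = 1.0 if r > 0 else -1.0
    combined = [(a + sign * b) / 2
                for a, b in zip(zscores(sensor_a), zscores(sensor_b))]
    print(f"r = {r:.3f}; features merged into one")
else:
    print(f"r = {r:.3f}; keeping both features")
```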
Dimensionality reduction?
PCA
What metric is best for linear regression and why?
R^2; for linear regression, it is the fraction of the target's variance that the model explains: (explained variance)/(total variance)
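A minimal sketch of the computation, using the equivalent form R^2 = 1 - SS_res/SS_tot (for least-squares linear regression with an intercept, this equals explained variance over total variance). The function name `r_squared` and the numbers are illustrative assumptions.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Total variation of the target around its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    # Variation left unexplained by the predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
score = r_squared(y_true, y_pred)
print(f"R^2 = {score:.2f}")
```

A model that always predicts the mean of the target scores 0, and a perfect fit scores 1.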
How does a decision tree decide what split to make at each node?
For classification: choose the split that minimizes the weighted Gini impurity of the child nodes (entropy is a common alternative)
For regression: