Supervised Learning Flashcards
What is Reinforcement Learning?
Software agent optimizes its behaviour based on rewards and punishments.
How to do VEDA for a categorical vs. a quantitative variable?
Using sns.barplot(x="day", y="total_bill", data=tips), where tips is a DataFrame.
OR tips.boxplot(column="total_bill", by="day")
How to do VEDA for binary categorical variables?
plt.figure()
sns.countplot(x='education', hue='party', data=df, palette='RdBu')
plt.xticks([0, 1], ['No', 'Yes'])
plt.show()
How to do pair-wise VEDA for 4 quantitative variables?
pd.plotting.scatter_matrix(df, c=y, figsize=[8, 8], marker='D')
What is accuracy?
Fraction of correct predictions.
How to access predictor values after removing target values from df?
df.drop('target', axis=1).values
How to turn list of values into format for sklearn?
np.array(values).reshape(-1, 1), which gives a single-column 2D array.
How to generate pairwise feature correlation VEDA?
sns.heatmap(df.corr(), square=True, cmap='RdYlGn')
What are a and b in y = ax +b ?
a is the slope and b is the y-intercept.
How to do k-fold cv with sklearn?
cross_val_score(reg, X, y, cv=k)
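A minimal sketch, assuming reg is an untrained regressor and X, y are your feature matrix and target:

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

reg = LinearRegression()
scores = cross_val_score(reg, X, y, cv=5)  # one R^2 score per fold
print(scores.mean())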
Why should regularization be used?
To penalize large coefficients and avoid over-fitting
What is Ridge regression?
Linear regression with L2 regularization, where the alpha hyperparameter weighs a penalty on the squared coefficients added to the OLS loss. Should usually be the first choice for regression over Lasso.
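A minimal Ridge sketch (the alpha value here is arbitrary; X_train etc. are assumed train/test splits):

from sklearn.linear_model import Ridge

ridge = Ridge(alpha=0.1)  # alpha weighs the L2 penalty
ridge.fit(X_train, y_train)
print(ridge.score(X_test, y_test))  # R^2 on held-out data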
What is Lasso regression?
Regression with L1 regularization, which can shrink coefficients exactly to 0 and thereby remove unimportant features. Great for feature selection.
How to specify parameters for Lasso and access its coefficients?
lasso = Lasso(alpha=0.4, normalize=True)
lasso.coef_
When is accuracy a poor metric when only fraction of correct predictions is used?
When there is class imbalance, low-frequency classes may never be correctly labelled. If the majority class makes up 99% of the data, a model that always predicts it achieves 99% accuracy while being useless on the minority class.
How to calculate accuracy using confusion matrix?
(TP+TN)/(TP+TN+FN+FP)
How to calculate precision using conf matrix?
TP/(TP+FP)
How to calculate recall using conf matrix?
TP/(TP+FN)
What is F1?
2 * (precision * recall) / (precision + recall)
How to get classifier’s performance report using SKL?
sklearn.metrics.classification_report
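A usage sketch, assuming y_test and y_pred already exist:

from sklearn.metrics import classification_report

# Prints precision, recall, F1, and support for each class
print(classification_report(y_test, y_pred))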
What is the ROC curve?
The receiver operating characteristic curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.
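A plotting sketch, assuming logreg is a fitted binary classifier and X_test, y_test are held-out data:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

y_pred_prob = logreg.predict_proba(X_test)[:, 1]  # P(positive class)
fpr, tpr, thresholds = roc_curve(y_test, y_pred_prob)
plt.plot([0, 1], [0, 1], 'k--')  # random-guess diagonal
plt.plot(fpr, tpr)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.show()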
What is a better than random AUC?
If the AUC is greater than 0.5, the model is better than random guessing.
How to use AUC during cross-validation?
cross_val_score(logreg, X, y, cv=5, scoring='roc_auc')
What are hyperparameters?
Parameters that cannot be learnt by fitting the model, such as k in k-NN or alpha in Ridge/Lasso.
What is C in LogReg?
C controls the inverse of the regularization strength. A large C can lead to an overfit model, while a small C can lead to an underfit model.
How to grid search with SKL?
GridSearchCV(logreg, {'C': c_space}, cv=5)
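A fuller sketch (the c_space grid is an arbitrary example; X, y are assumed to be your data):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

c_space = np.logspace(-5, 8, 15)  # candidate values for C
logreg_cv = GridSearchCV(LogisticRegression(), {'C': c_space}, cv=5)
logreg_cv.fit(X, y)
print(logreg_cv.best_params_, logreg_cv.best_score_)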
What to do since scikit-learn does not accept non-numerical features?
Use one-hot encoding, via scikit-learn's OneHotEncoder or pandas.get_dummies(df).
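A minimal pandas sketch, assuming df holds the categorical columns:

import pandas as pd

# One 0/1 column per category; drop_first avoids redundant dummies
df_encoded = pd.get_dummies(df, drop_first=True)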
What is the step used to avoid dropping nans?
Imputing data by making an educated guess at what it should be, such as the mean or mode:
imp = Imputer(missing_values='NaN', strategy='most_frequent', axis=0)
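A usage sketch (Imputer is the older scikit-learn API; newer versions use sklearn.impute.SimpleImputer instead):

from sklearn.preprocessing import Imputer

imp = Imputer(missing_values='NaN', strategy='most_frequent', axis=0)
X_imputed = imp.fit_transform(X)  # X assumed to contain NaNs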
Where to find scaler in SKL?
from sklearn.preprocessing import StandardScaler
How to configure pipeline with scaler and k-nn clf?
Pipeline([('scaler', StandardScaler()),
          ('knn', KNeighborsClassifier())])
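A runnable sketch with the needed imports, assuming X_train etc. are your splits:

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

pipeline = Pipeline([('scaler', StandardScaler()),
                     ('knn', KNeighborsClassifier())])
pipeline.fit(X_train, y_train)  # the scaler is fit on training data only
print(pipeline.score(X_test, y_test))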
What is gamma in SVM?
gamma controls the kernel coefficient (e.g., for the RBF kernel).
What does axis=1 mean in Pandas?
It means across columns (to the right)
What is log loss?
The metric (often used for multi-class classification) is the negative log-likelihood of a model that treats each test observation as drawn independently from a distribution placing the submitted probability mass on the corresponding class. Log loss steeply penalizes predictions that are both wrong and confident.
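As a formula, for N observations, M classes, an indicator y_ij = 1 when observation i belongs to class j, and submitted probability p_ij:

\text{logloss} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \log(p_{ij})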
How to see source of python function?
inspect.getsource
How to keep data distribution when using train/test split?
Stratification, which preserves the class proportions in both the train and test splits.
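A minimal sketch using scikit-learn's built-in support:

from sklearn.model_selection import train_test_split

# stratify=y keeps class proportions identical in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)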
How to limit size of text features?
Using the hashing trick to perform dimensionality reduction.
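A minimal sketch, assuming docs is an iterable of raw text strings:

from sklearn.feature_extraction.text import HashingVectorizer

# Hashes tokens into a fixed number of buckets instead of growing a
# vocabulary, so the feature dimension is capped at n_features
vec = HashingVectorizer(n_features=2**10)
X_text = vec.fit_transform(docs)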
How to add feature interaction in a model?
Using polynomial features, which add products of existing features as interaction terms.
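A minimal sketch with scikit-learn, assuming X is your feature matrix:

from sklearn.preprocessing import PolynomialFeatures

# interaction_only=True adds pairwise products x_i * x_j without squares
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_interact = poly.fit_transform(X)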
How to change scale of plot axis?
plt.xscale('log')