Supervised Learning Flashcards

1
Q

What is Reinforcement Learning?

A

A software agent optimizes its behaviour based on rewards and punishments.

2
Q

How to do VEDA for a categorical vs. a quantitative variable?

A

Using sns.barplot(x="day", y="total_bill", data=tips), where tips is a DataFrame.
OR tips.boxplot('total_bill', by='day')

3
Q

How to do VEDA for binary categorical variables?

A

plt.figure()
sns.countplot(x='education', hue='party', data=df, palette='RdBu')
plt.xticks([0, 1], ['No', 'Yes'])
plt.show()

4
Q

How to do pair-wise VEDA for 4 quantitative variables?

A

pd.plotting.scatter_matrix(df, c=y, figsize=[8, 8], marker='D')  # pd.scatter_matrix in older pandas

5
Q

What is accuracy?

A

Fraction of correct predictions.

6
Q

How to access the predictor values after removing the target column from a df?

A

df.drop('target', axis=1).values

7
Q

How to turn a 1-D array of values into the 2-D format sklearn expects?

A

X.reshape(-1,1)
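
A minimal sketch (X here is a hypothetical NumPy array):

import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])   # shape (4,): a flat sequence of values
X = X.reshape(-1, 1)                 # shape (4, 1): one feature column, as sklearn expects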

8
Q

How to generate pairwise feature correlation VEDA?

A

sns.heatmap(df.corr(), square=True, cmap='RdYlGn')

9
Q

What are a and b in y = ax + b?

A

a is the slope and b is the y-intercept.
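
For example, sklearn's LinearRegression recovers exactly these two quantities (a small sketch on synthetic data):

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[0.0], [1.0], [2.0]])
y = 3.0 * X.ravel() + 1.0                # y = ax + b with a=3, b=1
reg = LinearRegression().fit(X, y)
print(reg.coef_[0], reg.intercept_)      # slope a (3.0) and intercept b (1.0)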

10
Q

How to do k-fold CV with sklearn?

A

cross_val_score(reg, X, y, cv=k)
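
A minimal runnable sketch (synthetic data; reg and k stand in for any estimator and fold count):

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=100, n_features=3, random_state=0)
scores = cross_val_score(LinearRegression(), X, y, cv=5)  # one R^2 score per fold
print(scores.mean())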

11
Q

Why should regularization be used?

A

To penalize large coefficients and avoid overfitting.

12
Q

What is Ridge regression?

A

Regression with L2 regularization, where the alpha hyperparameter weights the penalty (the sum of squared coefficients) added to the OLS loss. A sensible first choice for regression, ahead of Lasso.
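
A minimal sketch on synthetic data (alpha=0.1 is an arbitrary illustration value):

from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
ridge = Ridge(alpha=0.1)                 # alpha weights the L2 penalty; alpha=0 is plain OLS
ridge.fit(X_train, y_train)
print(ridge.score(X_test, y_test))       # R^2 on held-out data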

13
Q

What is Lasso regression?

A

Regression with L1 regularization, where coefficients of unimportant features can shrink to exactly 0. Great for feature selection.

14
Q

How to specify parameters for Lasso and access its coefficients?

A

from sklearn.linear_model import Lasso

lasso = Lasso(alpha=0.4)   # normalize=True also worked in older scikit-learn versions

lasso.coef_   # available after lasso.fit(X, y)

15
Q

When is accuracy (the fraction of correct predictions) a poor metric?

A

When there is class imbalance, low-frequency classes may never be correctly labeled yet accuracy stays high: if the majority class makes up 99% of the data, a model that always predicts the majority class achieves 99% accuracy.

16
Q

How to calculate accuracy using confusion matrix?

A

(TP+TN)/(TP+TN+FN+FP)

17
Q

How to calculate precision using conf matrix?

A

TP/(TP+FP)

18
Q

How to calculate recall using conf matrix?

A

TP/(TP+FN)

19
Q

What is F1?

A

2 * (precision * recall) / (precision + recall)
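
Putting cards 16-19 together, with hypothetical counts:

TP, TN, FP, FN = 40, 50, 5, 5                        # hypothetical confusion-matrix counts

accuracy  = (TP + TN) / (TP + TN + FN + FP)          # 0.90
precision = TP / (TP + FP)                           # ~0.889
recall    = TP / (TP + FN)                           # ~0.889
f1 = 2 * precision * recall / (precision + recall)   # ~0.889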

20
Q

How to get a classifier's performance report using SKL?

A

sklearn.metrics.classification_report
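
Usage sketch (labels made up for illustration):

from sklearn.metrics import classification_report

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
print(classification_report(y_true, y_pred))  # per-class precision, recall, F1, support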

21
Q

What is the ROC curve?

A

The receiver operating characteristic curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.

22
Q

What is a better than random AUC?

A

If the AUC is greater than 0.5, the model is better than random guessing.

23
Q

How to use AUC during cross-validation?

A

cross_val_score(logreg, X, y, cv=5, scoring='roc_auc')
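
A runnable sketch (synthetic data; logreg as in the card):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, random_state=0)
auc = cross_val_score(LogisticRegression(), X, y, cv=5, scoring='roc_auc')
print(auc.mean())   # AUC averaged over the 5 folds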

24
Q

What are hyperparameters?

A

Parameters that cannot be learned by fitting the model, such as k in k-NN or alpha in Ridge/Lasso.

25
Q

What is C in LogReg?

A

C controls the inverse of the regularization strength. A large C can lead to an overfit model, while a small C can lead to an underfit model.

26
Q

How to grid search with SKL?

A

GridSearchCV(logreg, {'C': c_space}, cv=5)
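
A fuller sketch on synthetic data; c_space here is a hypothetical grid of candidate C values:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, random_state=0)
c_space = np.logspace(-3, 3, 7)
cv = GridSearchCV(LogisticRegression(), {'C': c_space}, cv=5)
cv.fit(X, y)
print(cv.best_params_, cv.best_score_)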

27
Q

What to do since scikit-learn does not accept non-numerical features?

A

Use one-hot encoding, via sklearn's OneHotEncoder or pandas.get_dummies(df).
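
A minimal sketch with a made-up DataFrame:

import pandas as pd

df = pd.DataFrame({'origin': ['US', 'Europe', 'Asia'], 'mpg': [21.0, 27.5, 32.1]})
print(pd.get_dummies(df))   # one binary column per 'origin' category
# pd.get_dummies(df, drop_first=True) drops one redundant column per feature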

28
Q

What step can be used to avoid dropping NaNs?

A

Imputing data, i.e. making an educated guess about what each missing value should be, such as the column mean or mode:
imp = Imputer(missing_values='NaN', strategy='most_frequent', axis=0)
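
Imputer is the older scikit-learn API; recent versions replace it with SimpleImputer. A hedged sketch:

import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, np.nan], [2.0, 3.0], [np.nan, 3.0]])
imp = SimpleImputer(missing_values=np.nan, strategy='most_frequent')
print(imp.fit_transform(X))   # NaNs replaced column-by-column with the mode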

29
Q

Where to find scaler in SKL?

A

from sklearn.preprocessing import StandardScaler

30
Q

How to configure a pipeline with a scaler and a k-NN classifier?

A

from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([('scaler', StandardScaler()),
                     ('knn', KNeighborsClassifier())])
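
Fitting then works like any estimator (synthetic data for illustration):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
pipeline.fit(X_train, y_train)          # scales, then fits k-NN
print(pipeline.score(X_test, y_test))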

31
Q

What is gamma in SVM?

A

gamma controls the kernel coefficient (for the RBF, poly, and sigmoid kernels). A large gamma can lead to an overfit model, a small gamma to an underfit one.

32
Q

What does axis=1 mean in Pandas?

A

Operate across columns, i.e. horizontally, to the right (e.g. df.drop('target', axis=1) drops a column).

33
Q

What is log loss?

A

The metric (often used for multi-class classification) is the negative log-likelihood of a model that treats each test observation as independently drawn from a distribution placing the submitted probability mass on the corresponding class.
Log loss gives a steep penalty to predictions that are both confident and wrong.
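
A small worked example with sklearn's log_loss (numbers invented for illustration):

from sklearn.metrics import log_loss

y_true = [0, 1, 1]
y_prob = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]   # predicted class probabilities
print(log_loss(y_true, y_prob))                 # ~0.51; the miss (0.3 on the true class) dominates
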
34
Q

How to see source of python function?

A

inspect.getsource

35
Q

How to preserve the data distribution when doing a train/test split?

A

Stratification
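
For example, train_test_split accepts a stratify argument (synthetic imbalanced data here):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=21)   # both splits keep the ~90/10 class ratio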

36
Q

How to limit the size of text features?

A

Use the hashing trick to perform dimensionality reduction.
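
A minimal sketch with sklearn's HashingVectorizer (n_features chosen arbitrarily):

from sklearn.feature_extraction.text import HashingVectorizer

vec = HashingVectorizer(n_features=2**10)   # hash tokens into a fixed number of columns
X = vec.transform(['cheap flights now', 'supervised learning flashcards'])
print(X.shape)                              # (2, 1024) regardless of vocabulary size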

37
Q

How to add feature interactions to a model?

A

Use polynomial features, which include interaction terms.
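
A minimal sketch with sklearn's PolynomialFeatures:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
print(poly.fit_transform(X))   # [[2. 3. 6.]]: keeps x1, x2 and adds the x1*x2 interaction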

38
Q

How to change the scale of a plot axis?

A

plt.xscale('log')