Scikit-learn/SKLL | Basics | Priority Flashcards

Question 1

Q

Idiom for pulling a target column out of a pandas DataFrame as a NumPy array and standardizing it for modeling.

Machine Learning with PyTorch and Scikit-Learn Chapter 9 p280

Answer

A

>>> y = df['SalePrice'].values
>>> y_std = sc_y.fit_transform(y[:, np.newaxis]).flatten()
>>> lr.fit(X_std, y_std)

Machine Learning with PyTorch and Scikit-Learn Chapter 9 p280
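The reshaping in this idiom is needed because StandardScaler expects a 2-D array. A minimal sketch of the shapes involved, assuming a hypothetical four-row DataFrame in place of the housing data (the values and the names y_2d and y_back are illustrative only):

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data standing in for the housing DataFrame.
df = pd.DataFrame({"SalePrice": [100.0, 200.0, 300.0, 400.0]})

y = df["SalePrice"].values        # 1-D array, shape (4,)
y_2d = y[:, np.newaxis]           # column vector, shape (4, 1)

sc_y = StandardScaler()
y_std = sc_y.fit_transform(y_2d).flatten()   # back to shape (4,)

# Undo the scaling to recover values on the original scale.
y_back = sc_y.inverse_transform(y_std[:, np.newaxis]).flatten()
```

Keeping the fitted sc_y around is what makes it possible to map standardized predictions back to the original units with inverse_transform.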

Question 2

Q

Example of how to standardize variables for modeling (both data and labels).

Answer

A

>>> import numpy as np
>>> from sklearn.preprocessing import StandardScaler
>>> X = df[['Gr Liv Area']].values
>>> y = df['SalePrice'].values
>>> sc_x = StandardScaler()
>>> sc_y = StandardScaler()
>>> X_std = sc_x.fit_transform(X)
>>> y_std = sc_y.fit_transform(y[:, np.newaxis]).flatten()
>>> # LinearRegressionGD is the book's own gradient-descent class, not part of scikit-learn
>>> lr = LinearRegressionGD(eta=0.1)
>>> lr.fit(X_std, y_std)

Machine Learning with PyTorch and Scikit-Learn Chapter 9 p280

Question 3

Q

Example of linear regression including training and prediction.

Answer

A

>>> from sklearn.linear_model import LinearRegression
>>> slr = LinearRegression()
>>> slr.fit(X, y)
>>> y_pred = slr.predict(X)

Machine Learning with PyTorch and Scikit-Learn Chapter 9 p283
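A self-contained sketch of the same fit/predict cycle, assuming hypothetical toy data that follows y = 2x + 1 exactly, so the learned parameters are easy to check:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: one feature, targets on the line y = 2x + 1.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

slr = LinearRegression()
slr.fit(X, y)
y_pred = slr.predict(X)

slope = slr.coef_[0]         # learned weight for the single feature
intercept = slr.intercept_   # learned bias term
```

After fitting, coef_ holds one weight per feature and intercept_ holds the bias.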

Question 4

Q

Confusion matrix in sklearn.

Answer

A

>>> from sklearn.metrics import confusion_matrix
>>> confmat = confusion_matrix(y_true=y_test, y_pred=y_pred)

Machine Learning with PyTorch and Scikit-Learn Chapter 6 p194
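To make the layout of the returned array concrete, here is a small sketch with hypothetical binary labels. Rows correspond to true classes and columns to predicted classes, so for labels 0 and 1 the flattened order is TN, FP, FN, TP:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels, chosen only to illustrate the layout.
y_test = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 0]

confmat = confusion_matrix(y_true=y_test, y_pred=y_pred)

# Row = true class, column = predicted class.
tn, fp, fn, tp = confmat.ravel()
```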

Question 5

Q

Get a scorer with different positive label.

Answer

A

>>> from sklearn.metrics import make_scorer, f1_score
>>> scorer = make_scorer(f1_score, pos_label=0)

Machine Learning with PyTorch and Scikit-Learn Chapter 6 p197
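A sketch of what changing the positive label does to the score. The data and the use of DummyClassifier (which here always predicts class 0) are assumptions made for illustration, not the book's setup:

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import f1_score, make_scorer

# Hypothetical imbalanced data: four examples of class 0, two of class 1.
X = np.zeros((6, 1))
y = np.array([0, 0, 0, 0, 1, 1])

# Baseline that always predicts the most frequent class (0).
clf = DummyClassifier(strategy="most_frequent").fit(X, y)

# Default positive label is 1; zero_division=0 silences the undefined-precision case.
scorer_pos1 = make_scorer(f1_score, zero_division=0)
scorer_pos0 = make_scorer(f1_score, pos_label=0)

# A scorer is called as scorer(estimator, X, y).
f1_for_1 = scorer_pos1(clf, X, y)
f1_for_0 = scorer_pos0(clf, X, y)
```

The same scorer object can be passed as the scoring argument of GridSearchCV or cross_val_score.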

Question 6

Q

Functions for precision, recall, F1, Matthews’ coefficient.

Answer

A

precision_score, recall_score, f1_score, matthews_corrcoef

Machine Learning with PyTorch and Scikit-Learn Chapter 6 p197
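A quick sketch calling the four functions on hypothetical labels with TP = 3, FP = 1, FN = 1, TN = 5, so the values can be checked by hand:

```python
from sklearn.metrics import (
    f1_score,
    matthews_corrcoef,
    precision_score,
    recall_score,
)

# Hypothetical labels: 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

pre = precision_score(y_true, y_pred)    # TP / (TP + FP) = 3/4
rec = recall_score(y_true, y_pred)       # TP / (TP + FN) = 3/4
f1 = f1_score(y_true, y_pred)            # harmonic mean of the two = 3/4
mcc = matthews_corrcoef(y_true, y_pred)  # (15 - 1) / sqrt(4*4*6*6) = 14/24
```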

Question 7

Q

Functions for ROC curve, AUC, and ROC AUC as a single value.

Answer

A

roc_curve, auc, roc_auc_score

Machine Learning with PyTorch and Scikit-Learn Chapter 6 p199
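A sketch of how the three functions relate, using hypothetical scores for four examples: roc_curve returns the curve's points, auc integrates any curve given as x/y arrays, and roc_auc_score goes straight from labels and scores to the area.

```python
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Hypothetical true labels and predicted scores for the positive class.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# Points of the ROC curve at each score threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

area_from_curve = auc(fpr, tpr)               # area under the plotted curve
area_direct = roc_auc_score(y_true, y_score)  # same value in one call
```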

Question 8

Q

What variant of macro-/micro-averaging does sklearn use by default for multi-class problems?

Answer

A

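A weighted variant of the macro-average, according to the book. Note that precision_score, recall_score, and f1_score themselves default to average='binary', so for multi-class targets the averaging method is passed explicitly through the average parameter (for example average='weighted').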

Machine Learning with PyTorch and Scikit-Learn Chapter 6 p201
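A sketch comparing the averaging options on hypothetical three-class labels. Per-class precision here is 1, 2/3, and 1, with supports 4, 2, and 1:

```python
from sklearn.metrics import precision_score

# Hypothetical three-class labels.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2]

# Unweighted mean of the per-class precisions.
macro = precision_score(y_true, y_pred, average="macro")
# Per-class precisions weighted by class support (4, 2, 1).
weighted = precision_score(y_true, y_pred, average="weighted")
# Pooled over all predictions: total TP / total predictions.
micro = precision_score(y_true, y_pred, average="micro")
```

Macro-averaging treats every class equally, weighted averaging favors frequent classes, and micro-averaging treats every example equally.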

Question 9

Q

Idiom to predict the class label of a single example.

Machine Learning with PyTorch and Scikit-Learn Chapter 3 p72

Answer

A

>>> lr.predict(X_test_std[0, :].reshape(1, -1))
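The reshape is needed because scikit-learn estimators expect a 2-D array of shape (n_examples, n_features), while slicing out one row gives a 1-D array. A sketch with hypothetical, already-standardized toy data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical standardized data: two well-separated classes.
X_train_std = np.array([[-1.0, -1.0], [-0.8, -1.2], [1.0, 1.0], [1.2, 0.8]])
y_train = np.array([0, 0, 1, 1])
X_test_std = np.array([[0.9, 1.1], [-1.1, -0.9]])

lr = LogisticRegression().fit(X_train_std, y_train)

row = X_test_std[0, :]        # shape (2,): a 1-D array of features
single = row.reshape(1, -1)   # shape (1, 2): one example, two features
label = lr.predict(single)    # array holding one predicted label
```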

Question 10

Q

Function: probability that training examples belong to a certain class.

Machine Learning with PyTorch and Scikit-Learn Chapter 3 p72

Answer

A

predict_proba()
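A sketch of the returned array, assuming hypothetical toy data: each row holds one probability per class (in the order of classes_), the rows sum to 1, and taking the most probable class reproduces predict.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data with two classes.
X_train = np.array([[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_new = np.array([[-1.2], [1.8]])

lr = LogisticRegression().fit(X_train, y_train)

proba = lr.predict_proba(X_new)              # shape (n_examples, n_classes)
row_sums = proba.sum(axis=1)                 # each row sums to 1
labels = lr.classes_[proba.argmax(axis=1)]   # most probable class per row
```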

Question 11

Q

Example of how to fit a logistic regression model.

Machine Learning with PyTorch and Scikit-Learn Chapter 3 p70

Answer

A

>>> from sklearn.linear_model import LogisticRegression
>>> lr = LogisticRegression(C=100.0, solver='lbfgs',
...                         multi_class='ovr')
>>> lr.fit(X_train_std, y_train)

Question 12

Q

Function to compute a classifier’s prediction accuracy by combining the predict call with accuracy_score.

Machine Learning with PyTorch and Scikit-Learn Chapter 3 p57

Answer

A

>>> print('Accuracy: %.3f' % ppn.score(X_test_std, y_test))
Accuracy: 0.978
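A sketch showing that score gives the same number as calling predict and accuracy_score separately. The toy data is hypothetical; the Perceptron settings are an assumption based on the ppn name used above:

```python
import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score

# Hypothetical standardized toy data.
X_train_std = np.array([[-1.0, -1.0], [-0.8, -1.2], [1.0, 1.0], [1.2, 0.8]])
y_train = np.array([0, 0, 1, 1])
X_test_std = np.array([[0.9, 1.1], [-1.1, -0.9], [0.7, 0.6]])
y_test = np.array([1, 0, 1])

ppn = Perceptron(eta0=0.1, random_state=1)
ppn.fit(X_train_std, y_train)

# One call: predict internally, then compute accuracy.
acc_via_score = ppn.score(X_test_std, y_test)
# Two steps: explicit predict, then accuracy_score.
acc_via_metric = accuracy_score(y_test, ppn.predict(X_test_std))
```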

Question 13

Q

Idiom to print the number of misclassified examples.

Machine Learning with PyTorch and Scikit-Learn Chapter 3 p56

Answer

A

print(f'Misclassified examples: {(y_test != y_pred).sum()}')

Question 14

Q

Get a train/test split.

Machine Learning with PyTorch and Scikit-Learn Chapter 3 p55

Answer

A

>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, test_size=0.3, random_state=1, stratify=y
... )
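A sketch of what stratify=y does, assuming hypothetical data with a 60/40 class ratio: both the training and test subsets keep that ratio.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 50 examples, 30 of class 0 and 20 of class 1.
X = np.arange(100).reshape(50, 2)
y = np.array([0] * 30 + [1] * 20)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Class counts per subset; the 60/40 ratio is preserved in each.
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
```

Fixing random_state makes the shuffle, and therefore the split, reproducible across runs.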