Scikit-learn/SKLL | Basics | Priority Flashcards

1
Q

Idiom for turning a pandas DataFrame column into a standardized 1-D target array for modeling.

A

>>> y = df['SalePrice'].values
>>> y_std = sc_y.fit_transform(y[:, np.newaxis]).flatten()
>>> lr.fit(X_std, y_std)
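
The reshaping step in this idiom can be sketched on its own: StandardScaler expects a 2-D array, so the 1-D target gets a temporary column axis and is flattened afterwards. The toy DataFrame below is a hypothetical stand-in for the housing data, and the inverse_transform call is an addition for illustration, not part of the card.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy frame standing in for the housing data.
df = pd.DataFrame({"SalePrice": [100000.0, 150000.0, 200000.0, 250000.0]})

y = df["SalePrice"].values                    # 1-D array, shape (4,)
sc_y = StandardScaler()

y_2d = y[:, np.newaxis]                       # 2-D array, shape (4, 1)
y_std = sc_y.fit_transform(y_2d).flatten()    # back to shape (4,)

# inverse_transform maps standardized values back to the original scale.
y_back = sc_y.inverse_transform(y_std.reshape(-1, 1)).flatten()
```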

2
Q

Example of how to standardize variables for modeling (both data and labels).

A

>>> import numpy as np
>>> from sklearn.preprocessing import StandardScaler
>>> X = df[['Gr Liv Area']].values
>>> y = df['SalePrice'].values
>>> sc_x = StandardScaler()
>>> sc_y = StandardScaler()
>>> X_std = sc_x.fit_transform(X)
>>> y_std = sc_y.fit_transform(y[:, np.newaxis]).flatten()
>>> # LinearRegressionGD is a custom class, not part of scikit-learn
>>> lr = LinearRegressionGD(eta=0.1)
>>> lr.fit(X_std, y_std)
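
Once both scalers are fitted, predictions for new inputs have to pass through them in the right order. The sketch below assumes made-up data with an exact linear relation and uses scikit-learn's LinearRegression as a stand-in for the custom LinearRegressionGD, which is not defined on this card.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical data: living area and a price that is exactly linear in it.
X = np.array([[1000.0], [1500.0], [2000.0], [2500.0]])
y = 100.0 * X.flatten() + 50000.0

sc_x, sc_y = StandardScaler(), StandardScaler()
X_std = sc_x.fit_transform(X)
y_std = sc_y.fit_transform(y[:, np.newaxis]).flatten()

lr = LinearRegression()   # stand-in for the custom LinearRegressionGD
lr.fit(X_std, y_std)

# New input: reuse the fitted feature scaler (transform, not fit_transform),
# then undo the target scaling to get a price in original units.
X_new_std = sc_x.transform(np.array([[1800.0]]))
y_new_std = lr.predict(X_new_std)
y_new = sc_y.inverse_transform(y_new_std.reshape(-1, 1)).flatten()
```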

3
Q

Example of linear regression including training and prediction.

A

>>> from sklearn.linear_model import LinearRegression
>>> slr = LinearRegression()
>>> slr.fit(X, y)
>>> y_pred = slr.predict(X)
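
A self-contained version of the same fit/predict pattern, as a minimal sketch. The data are invented so that y = 2x + 1 holds exactly, which makes the learned coef_ and intercept_ easy to check.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data following y = 2x + 1 exactly.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

slr = LinearRegression()
slr.fit(X, y)                 # training
y_pred = slr.predict(X)       # prediction on the training inputs

slope = slr.coef_[0]          # learned weight for the single feature
intercept = slr.intercept_    # learned bias term
```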

4
Q

Confusion matrix in sklearn.

A

>>> from sklearn.metrics import confusion_matrix
>>> confmat = confusion_matrix(y_true=y_test, y_pred=y_pred)
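
To read the result: rows are true labels and columns are predicted labels, so for a binary problem the flattened matrix is ordered TN, FP, FN, TP. A small sketch with hand-picked labels (the label arrays are invented for illustration):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Invented labels for a binary problem.
y_test = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

confmat = confusion_matrix(y_true=y_test, y_pred=y_pred)
# Rows = true class, columns = predicted class:
# [[2 1]
#  [1 2]]

# For binary labels, ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = confmat.ravel()
```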

5
Q

Create a scorer that uses a different positive label.

A

>>> from sklearn.metrics import f1_score, make_scorer
>>> scorer = make_scorer(f1_score, pos_label=0)
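
A scorer built this way can be passed as the scoring argument of cross_val_score (or a grid search), or called directly as scorer(estimator, X, y). The sketch below assumes a synthetic dataset from make_classification and a LogisticRegression model, neither of which appears on the card.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import cross_val_score

# Synthetic binary dataset, for illustration only.
X, y = make_classification(n_samples=200, random_state=0)

# F1 computed with class 0 treated as the positive class.
scorer = make_scorer(f1_score, pos_label=0)

clf = LogisticRegression(max_iter=1000)
cv_scores = cross_val_score(clf, X, y, scoring=scorer, cv=5)

# Calling the scorer directly is equivalent to predicting and scoring by hand.
clf.fit(X, y)
direct = scorer(clf, X, y)
manual = f1_score(y, clf.predict(X), pos_label=0)
```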

6
Q

Functions in sklearn.metrics for precision, recall, F1, and the Matthews correlation coefficient.

A

precision_score, recall_score, f1_score, matthews_corrcoef
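
All four functions share the (y_true, y_pred) calling convention. A minimal sketch on invented binary labels, chosen so that TP = 2, FP = 1, FN = 1, TN = 2:

```python
import numpy as np
from sklearn.metrics import (
    f1_score,
    matthews_corrcoef,
    precision_score,
    recall_score,
)

# Invented labels: TP = 2, FP = 1, FN = 1, TN = 2.
y_true = np.array([0, 0, 1, 1, 1, 0])
y_pred = np.array([0, 1, 1, 1, 0, 0])

pre = precision_score(y_true, y_pred)    # TP / (TP + FP) = 2/3
rec = recall_score(y_true, y_pred)       # TP / (TP + FN) = 2/3
f1 = f1_score(y_true, y_pred)            # harmonic mean of the two = 2/3
mcc = matthews_corrcoef(y_true, y_pred)  # (TP*TN - FP*FN) / sqrt(...) = 1/3
```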

7
Q

Functions for ROC curve, AUC, and ROC AUC as a single value.

A

roc_curve, auc, roc_auc_score

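
How the three fit together, as a short sketch: roc_curve returns the curve's points, auc integrates any curve given as x/y arrays, and roc_auc_score goes from labels and scores to the area in one call. The labels and scores below are invented.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Invented true labels and classifier scores (higher = more likely class 1).
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# Points on the ROC curve, one per decision threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Two routes to the same area under the curve.
area_from_curve = auc(fpr, tpr)
area_direct = roc_auc_score(y_true, y_score)
```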
8
Q

What variant of macro-/micro-averaging does sklearn use by default for multi-class problems?

A

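A weighted macro-average: each class's score is weighted by its number of true instances (average='weighted'). Note that precision_score, recall_score, and f1_score themselves default to average='binary', so multi-class targets need an explicit average argument.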
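
The averaging variants can be compared side by side by passing the average argument explicitly. This is a sketch on invented three-class labels, using precision_score as the example metric:

```python
import numpy as np
from sklearn.metrics import precision_score

# Invented three-class labels; class supports are 4, 2, and 2.
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 1, 1, 1, 2, 0])

per_class = precision_score(y_true, y_pred, average=None)   # one value per class
support = np.bincount(y_true)                               # true instances per class

macro = precision_score(y_true, y_pred, average="macro")        # plain mean over classes
weighted = precision_score(y_true, y_pred, average="weighted")  # mean weighted by support
micro = precision_score(y_true, y_pred, average="micro")        # pooled over all examples
```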

9
Q

Idiom to predict the class label of a single example.

A

>>> lr.predict(X_test_std[0, :].reshape(1, -1))

10
Q

Method that returns the probability that each example belongs to each class.

A

predict_proba()
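
A sketch of how the probabilities relate to the predicted label: each row of the output sums to 1, and the column with the largest probability corresponds to the class that predict returns. The Iris data and the max_iter setting are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
lr = LogisticRegression(C=100.0, max_iter=1000)
lr.fit(X, y)

# A single example must still be 2-D: shape (1, n_features).
x_single = X[0, :].reshape(1, -1)

proba = lr.predict_proba(x_single)    # shape (1, n_classes)

# The most probable class matches what predict() returns.
label_from_proba = lr.classes_[np.argmax(proba, axis=1)]
label_from_predict = lr.predict(x_single)
```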

11
Q

Example of how to fit a logistic regression model.

A

>>> from sklearn.linear_model import LogisticRegression
>>> lr = LogisticRegression(C=100.0, solver='lbfgs',
...                         multi_class='ovr')
>>> lr.fit(X_train_std, y_train)

12
Q

Method that computes a classifier’s prediction accuracy by combining the predict call with accuracy_score.

A

>>> print('Accuracy: %.3f' % ppn.score(X_test_std, y_test))
Accuracy: 0.978
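
The equivalence can be checked directly: for a classifier, score(X, y) returns the same number as accuracy_score(y, predict(X)). The sketch assumes the Iris data and a Perceptron with eta0=0.1; the card itself does not show how ppn was built.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Fit the scaler on the training data only, then apply it to both sets.
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)

ppn = Perceptron(eta0=0.1, random_state=1)   # assumed settings
ppn.fit(X_train_std, y_train)

acc_via_score = ppn.score(X_test_std, y_test)
y_pred = ppn.predict(X_test_std)
acc_via_metric = accuracy_score(y_test, y_pred)

# Count of misclassified test examples.
n_wrong = (y_test != y_pred).sum()
```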

13
Q

Idiom to print the number of misclassified examples.

A

>>> print(f'Misclassified examples: {(y_test != y_pred).sum()}')

14
Q

Get a train/test split.

A

>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, test_size=0.3, random_state=1, stratify=y
... )
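
What stratify=y does can be seen by counting labels in each subset: the class proportions of the full dataset are preserved in both the training and the test split. A minimal sketch, assuming the Iris data (150 examples, 50 per class):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)   # 3 classes, 50 examples each

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Per-class counts in each subset; stratification keeps them balanced.
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
```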
