Scikit-learn/SKLL | Basics | Priority Flashcards
Idiom for extracting a target column from a pandas DataFrame as a NumPy array and standardizing it for modeling.
>>> y = df['SalePrice'].values
>>> y_std = sc_y.fit_transform(y[:, np.newaxis]).flatten()
>>> lr.fit(X_std, y_std)
Machine Learning with PyTorch and Scikit-Learn Chapter 9 p280
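A minimal sketch of why the idiom adds and then removes an axis: StandardScaler expects a 2D array, while the target is usually kept 1D. The prices below are made-up values, not taken from the book's dataset.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up sale prices standing in for df['SalePrice'].values
y = np.array([200000.0, 150000.0, 310000.0, 180000.0])

sc_y = StandardScaler()
y_col = y[:, np.newaxis]                      # shape (4,) -> (4, 1)
y_std = sc_y.fit_transform(y_col).flatten()   # scaled, back to shape (4,)

print(y.shape, y_col.shape, y_std.shape)
```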
Example of how to standardize variables for modeling (both data and labels).
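>>> import numpy as np
>>> from sklearn.preprocessing import StandardScaler
>>> X = df[['Gr Liv Area']].values
>>> y = df['SalePrice'].values
>>> sc_x = StandardScaler()
>>> sc_y = StandardScaler()
>>> X_std = sc_x.fit_transform(X)
>>> # StandardScaler expects 2D input: add a column axis, then flatten back to 1D
>>> y_std = sc_y.fit_transform(y[:, np.newaxis]).flatten()
>>> # LinearRegressionGD is the gradient-descent regressor defined in the book, not a scikit-learn class
>>> lr = LinearRegressionGD(eta=0.1)
>>> lr.fit(X_std, y_std)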
Machine Learning with PyTorch and Scikit-Learn Chapter 9 p280
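A sketch of the same standardization flow, extended with the step of mapping a prediction back to the original scale via inverse_transform. It assumes synthetic data and uses scikit-learn's LinearRegression as a stand-in for the book's LinearRegressionGD.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for living area and sale price (assumption, not book data)
rng = np.random.default_rng(0)
X = rng.uniform(500, 3000, size=(50, 1))
y = 100.0 * X[:, 0] + rng.normal(0, 5000, size=50)

sc_x, sc_y = StandardScaler(), StandardScaler()
X_std = sc_x.fit_transform(X)
y_std = sc_y.fit_transform(y[:, np.newaxis]).flatten()

model = LinearRegression().fit(X_std, y_std)

# Scale a new input with the fitted feature scaler, predict, then undo the target scaling
x_new_std = sc_x.transform(np.array([[2500.0]]))
pred_std = model.predict(x_new_std)
pred_price = sc_y.inverse_transform(pred_std.reshape(-1, 1)).flatten()
print(pred_price)
```

New inputs are passed through transform, not fit_transform, so they are scaled with the statistics learned from the training data.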
Example of linear regression including training and prediction.
>>> from sklearn.linear_model import LinearRegression
>>> slr = LinearRegression()
>>> slr.fit(X, y)
>>> y_pred = slr.predict(X)
Machine Learning with PyTorch and Scikit-Learn Chapter 9 p283
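A small self-contained sketch of the fit/predict cycle, assuming toy data that follows y = 2x + 1 exactly so the learned slope and intercept are easy to check.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data on the line y = 2x + 1 (illustrative values only)
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

slr = LinearRegression()
slr.fit(X, y)
y_pred = slr.predict(X)

# The fitted slope is in coef_, the intercept in intercept_
print(slr.coef_[0], slr.intercept_)
```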
Confusion matrix in sklearn.
>>> from sklearn.metrics import confusion_matrix
>>> confmat = confusion_matrix(y_true=y_test, y_pred=y_pred)
Machine Learning with PyTorch and Scikit-Learn Chapter 6 p194
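A sketch with hand-made labels showing how to read the result: rows are true classes, columns are predicted classes, so for binary labels 0/1 the flattened matrix is TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix

# Hand-made labels for illustration
y_true = [0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1, 0]

confmat = confusion_matrix(y_true=y_true, y_pred=y_pred)
tn, fp, fn, tp = confmat.ravel()
print(confmat)
```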
Get a scorer with different positive label.
>>> from sklearn.metrics import make_scorer, f1_score
>>> scorer = make_scorer(f1_score, pos_label=0)
Machine Learning with PyTorch and Scikit-Learn Chapter 6 p197
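A sketch of where such a scorer is typically used: it can be passed as the scoring argument of cross_val_score (or GridSearchCV). The dataset and model here are assumptions chosen only to make the example runnable.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import cross_val_score

# Synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=200, n_features=5, random_state=1)

# F1 computed with class 0 treated as the positive class
scorer = make_scorer(f1_score, pos_label=0)
scores = cross_val_score(LogisticRegression(), X, y, scoring=scorer, cv=5)
print(scores.mean())
```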
Functions for precision, recall, F1, and the Matthews correlation coefficient (MCC).
precision_score, recall_score, f1_score, matthews_corrcoef
Machine Learning with PyTorch and Scikit-Learn Chapter 6 p197
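A sketch of the four functions on hand-made binary labels; all share the (y_true, y_pred) calling convention. With these labels there are 2 true positives, 1 false positive, 2 false negatives, and 3 true negatives.

```python
from sklearn.metrics import (
    f1_score,
    matthews_corrcoef,
    precision_score,
    recall_score,
)

# Hand-made labels: TP=2, FP=1, FN=2, TN=3
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

pre = precision_score(y_true, y_pred)    # TP / (TP + FP)
rec = recall_score(y_true, y_pred)       # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)            # harmonic mean of precision and recall
mcc = matthews_corrcoef(y_true, y_pred)  # ranges from -1 to 1
print(pre, rec, f1, mcc)
```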
Functions for ROC curve, AUC, and ROC AUC as a single value.
roc_curve, auc, roc_auc_score
Machine Learning with PyTorch and Scikit-Learn Chapter 6 p199
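A sketch of how the three functions relate, using four hand-made scores: roc_curve returns the curve's points, auc integrates any curve given as x/y arrays, and roc_auc_score goes from labels and scores to the area directly.

```python
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Hand-made labels and predicted scores for the positive class
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, y_score)
area_from_curve = auc(fpr, tpr)
area_direct = roc_auc_score(y_true, y_score)
print(area_from_curve, area_direct)
```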
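What variant of macro-/micro-averaging does sklearn use by default for multi-class problems?
Per the book: the weighted macro-average. Note that precision_score, recall_score, and f1_score themselves default to average='binary', so multi-class targets need an explicit average argument ('micro', 'macro', or 'weighted').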
Machine Learning with PyTorch and Scikit-Learn Chapter 6 p201
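A sketch comparing the averaging schemes on hand-made three-class labels. In the scikit-learn releases I am aware of, calling precision_score on multi-class targets without an average argument raises a ValueError, which the try/except below checks; treat that behavior as version-dependent.

```python
from sklearn.metrics import precision_score

# Hand-made three-class labels; per-class precision is 1, 2/3, 1
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2]

macro = precision_score(y_true, y_pred, average="macro")        # unweighted mean over classes
weighted = precision_score(y_true, y_pred, average="weighted")  # mean weighted by class support
micro = precision_score(y_true, y_pred, average="micro")        # pooled over all predictions

try:
    precision_score(y_true, y_pred)  # default average='binary'
    default_ok = True
except ValueError:
    default_ok = False

print(macro, weighted, micro, default_ok)
```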
Idiom to predict the class label of a single example.
Machine Learning with PyTorch and Scikit-Learn Chapter 3 p72
>>> lr.predict(X_test_std[0, :].reshape(1, -1))
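A sketch of why the reshape is needed: selecting one row yields a 1D array, while predict expects a 2D array of shape (n_samples, n_features). The Iris data and unscaled features are assumptions made to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
lr = LogisticRegression(max_iter=1000).fit(X, y)

single = X[0, :]                 # shape (4,): one example as a 1D array
as_row = single.reshape(1, -1)   # shape (1, 4): one row, feature count inferred
label = lr.predict(as_row)       # array holding a single class label
print(single.shape, as_row.shape, label)
```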
Method that returns the predicted probability that examples belong to each class.
Machine Learning with PyTorch and Scikit-Learn Chapter 3 p72
predict_proba()
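A sketch of the method's output, assuming a logistic regression fitted on the Iris data: one row per example, one column per class, each row summing to 1. Taking the argmax of a row recovers the predicted class.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
lr = LogisticRegression(max_iter=1000).fit(X, y)

proba = lr.predict_proba(X[:3, :])        # shape (3 examples, 3 classes)
labels = lr.classes_[proba.argmax(axis=1)]  # most probable class per row
print(proba.round(3))
print(labels)
```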
Example of how to fit a logistic regression model.
Machine Learning with PyTorch and Scikit-Learn Chapter 3 p70
>>> from sklearn.linear_model import LogisticRegression
>>> lr = LogisticRegression(C=100.0, solver='lbfgs',
...                         multi_class='ovr')
>>> lr.fit(X_train_std, y_train)
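A self-contained sketch of the surrounding workflow, assuming the Iris data: split, fit the scaler on the training set only, then fit the classifier. The multi_class argument is left out here because newer scikit-learn releases deprecate it, and max_iter=1000 is my own addition to avoid convergence warnings.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Learn scaling statistics from the training set only, then apply to both sets
sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

# C is the inverse regularization strength: larger C means weaker regularization
lr = LogisticRegression(C=100.0, solver="lbfgs", max_iter=1000)
lr.fit(X_train_std, y_train)
print(lr.coef_.shape)
```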
Method that computes a classifier’s prediction accuracy, combining the predict call with accuracy_score.
Machine Learning with PyTorch and Scikit-Learn Chapter 3 p57
>>> print('Accuracy: %.3f' % ppn.score(X_test_std, y_test))
Accuracy: 0.978
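A sketch checking that score gives the same number as calling predict and then accuracy_score. The Perceptron settings and the Iris data are assumptions for illustration, so the accuracy will not necessarily match the value printed above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)
sc = StandardScaler().fit(X_train)
X_train_std, X_test_std = sc.transform(X_train), sc.transform(X_test)

ppn = Perceptron(eta0=0.1, random_state=1).fit(X_train_std, y_train)

acc_via_score = ppn.score(X_test_std, y_test)
acc_via_metric = accuracy_score(y_test, ppn.predict(X_test_std))
print("Accuracy: %.3f" % acc_via_score)
```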
Idiom to print the number of misclassified examples.
Machine Learning with PyTorch and Scikit-Learn Chapter 3 p56
print(f'Misclassified examples: {(y_test != y_pred).sum()}')
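A sketch of how the idiom works: comparing two NumPy arrays elementwise yields a boolean array, and summing it counts the True entries. The labels are hand-made, and the cross-check uses accuracy_score with normalize=False, which returns the number of correct predictions.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hand-made labels for illustration
y_test = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 2, 2, 2, 1, 1])

mismatch = y_test != y_pred        # boolean array, True where the prediction is wrong
n_wrong = mismatch.sum()
n_right = accuracy_score(y_test, y_pred, normalize=False)
print(f"Misclassified examples: {n_wrong}")
```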
Get a train/test split.
Machine Learning with PyTorch and Scikit-Learn Chapter 3 p55
>>> from sklearn.model_selection import train_test_split
>>> X_train, X_test, y_train, y_test = train_test_split(
...     X, y, test_size=0.3, random_state=1, stratify=y
... )
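A sketch of what stratify=y does, assuming the Iris data (50 examples per class): class proportions in the training and test sets match those of the full label array, which np.bincount makes easy to verify.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y
)

# Count examples per class in each subset
print("Labels counts in y:", np.bincount(y))
print("Labels counts in y_train:", np.bincount(y_train))
print("Labels counts in y_test:", np.bincount(y_test))
```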