Methods Flashcards

1
Q

LOCF

A

Last Observation Carried Forward. An imputation method for missing data that replaces each missing value with the most recent observed value for the same subject or series (common in longitudinal data).
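
A minimal LOCF sketch with pandas (ffill carries the last observed value forward within each subject); the data here is made up for illustration:

import numpy as np
import pandas as pd

# Hypothetical longitudinal data with missing follow-up values
df = pd.DataFrame({"subject": [1, 1, 1, 2, 2],
                   "visit":   [1, 2, 3, 1, 2],
                   "score":   [10.0, np.nan, np.nan, 7.0, np.nan]})

# LOCF: carry the last observed value forward within each subject
df["score_locf"] = df.groupby("subject")["score"].ffill()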

2
Q

MICE

A

Multiple Imputation by Chained Equations
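
scikit-learn's IterativeImputer is inspired by MICE (by default it returns a single imputation rather than multiple); a minimal sketch with made-up data:

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the estimator)
from sklearn.impute import IterativeImputer

X = np.array([[1.0, 2.0], [3.0, np.nan], [np.nan, 6.0], [8.0, 9.0]])

# Each feature with missing values is modelled as a function of the other features,
# cycling through the features ("chained equations") for several rounds
imputer = IterativeImputer(max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X)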

3
Q

SMOTE

A

Synthetic Minority Oversampling Technique (SMOTE) is a statistical technique for increasing the number of cases in your dataset in a balanced way. The module works by generating new instances from existing minority cases that you supply as input. This implementation of SMOTE does not change the number of majority cases.
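
A minimal sketch using the open-source imbalanced-learn package rather than the Studio module; note the majority class is left unchanged:

from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# Imbalanced toy data: roughly 10% minority class
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
print(Counter(y))

# SMOTE synthesises new minority samples by interpolating between a minority
# case and its nearest minority-class neighbours
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y_res))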

4
Q

Probabilistic PCA

A

Can be used for imputation. Probabilistic PCA generalizes classical PCA by modeling the data with a low-dimensional latent-variable Gaussian model, so missing values can be inferred from the fitted model.

5
Q

Entropy MDL

A

A binning method. This method requires that you select the column you want to predict and the column or columns that you want to group into bins. It then makes a pass over the data and attempts to determine the number of bins that minimizes the entropy. In other words, it chooses a number of bins that allows the data column to best predict the target column.
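
The exact MDL criterion is more involved, but a simplified sketch of the core idea (pick the bin count whose bins best predict the target, i.e. minimize conditional entropy) could look like this; the data and helper function are illustrative only:

import numpy as np
import pandas as pd

def conditional_entropy(binned, target):
    # H(target | bin): average entropy of the target within each bin
    total = len(target)
    h = 0.0
    for _, grp in pd.Series(target).groupby(binned):
        p = grp.value_counts(normalize=True)
        h += (len(grp) / total) * -(p * np.log2(p)).sum()
    return h

rng = np.random.default_rng(0)
x = rng.normal(size=500)           # column to group into bins
y = (x > 0.5).astype(int)          # hypothetical target column to predict

# Try several bin counts and keep the one whose bins best predict the target;
# the real Entropy MDL method adds an MDL penalty so more bins are not always preferred
scores = {k: conditional_entropy(pd.cut(x, bins=k, labels=False), y) for k in range(2, 11)}
best_k = min(scores, key=scores.get)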

6
Q

PQuantile

A

A normalization option applied after quantile binning: the binned values are rescaled into the range [0, 1]. Normalizing transforms the values but does not affect the final number of bins.
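
A rough pandas illustration of the idea (quantile binning, then rescaling the bin values into [0, 1]); this approximates the behaviour and is not the module's exact formula:

import numpy as np
import pandas as pd

x = pd.Series(np.random.default_rng(0).lognormal(size=1000))

# Quantile binning into 10 bins (bin index 0..9)
bins = pd.qcut(x, q=10, labels=False)

# Normalize the binned values into [0, 1]; the number of bins is unchanged
normalized = bins / (10 - 1)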

7
Q

SHAP explainer

A

Uses Shapley values to explain any machine learning model or Python function.

  • Global interpretability: SHAP values show how much each predictor contributes, positively or negatively, to the target variable.
  • Local interpretability: each observation gets its own set of SHAP values, which greatly increases transparency; we can explain why a case receives its prediction and the contributions of the predictors.
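
A minimal sketch with the shap package, using a scikit-learn model and toy data for illustration:

import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(random_state=0).fit(X, y)

explainer = shap.Explainer(model, X.iloc[:100])   # background data for the explainer
shap_values = explainer(X.iloc[:200])

shap.plots.beeswarm(shap_values)       # global: contribution of each predictor
shap.plots.waterfall(shap_values[0])   # local: why this one case got its prediction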
8
Q

Mimic explainer

A

Mimic explainer is based on the idea of training global surrogate models to mimic blackbox models. A global surrogate model is an intrinsically interpretable model that is trained to approximate the predictions of any black box model as accurately as possible. Data scientists can interpret the surrogate model to draw conclusions about the black box model. You can use one of the following interpretable models as your surrogate model: LightGBM (LGBMExplainableModel), Linear Regression (LinearExplainableModel), Stochastic Gradient Descent (SGDExplainableModel), or Decision Tree (DecisionTreeExplainableModel).
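
The code below is not the MimicExplainer API itself, just a generic scikit-learn sketch of the global-surrogate idea (a shallow decision tree trained to mimic a gradient-boosting "black box"):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "Black box" model
blackbox = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

# Global surrogate: an interpretable model trained on the black box's predictions
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, blackbox.predict(X_train))

# Fidelity: how closely the surrogate reproduces the black box (not the true labels)
fidelity = accuracy_score(blackbox.predict(X_test), surrogate.predict(X_test))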

9
Q

PFI

A

Permutation Feature Importance is a technique used to explain classification and regression models that is inspired by Breiman’s Random Forests paper (see section 10). At a high level, the way it works is by randomly shuffling data one feature at a time for the entire dataset and calculating how much the performance metric of interest changes. The larger the change, the more important that feature is. PFI can explain the overall behavior of any underlying model but does not explain individual predictions.
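
scikit-learn's permutation_importance implements the same idea; a minimal sketch on toy data:

from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Shuffle one feature at a time and measure how much the score drops;
# larger drops mean more important features
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
print(result.importances_mean)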

10
Q

Fast Forest Quantile Regression

A

The Fast Forest Quantile Regression module in Machine Learning Studio (classic) creates a regression model that can predict values for a specified number of quantiles.

Quantile regression is useful if you want to understand more about the distribution of the predicted value, rather than get a single mean prediction value.
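
The Studio module itself is not available as a library call here; as a rough stand-in, the quantile loss in scikit-learn's GradientBoostingRegressor shows the "one model per requested quantile" pattern on illustrative data:

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)

# One model per requested quantile, e.g. the 10th, 50th and 90th percentiles
quantile_models = {
    q: GradientBoostingRegressor(loss="quantile", alpha=q, random_state=0).fit(X, y)
    for q in (0.1, 0.5, 0.9)
}
preds = {q: m.predict(X[:5]) for q, m in quantile_models.items()}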

11
Q

Boosted Decision Tree Regression

A

Boosting means that each tree is dependent on prior trees. The algorithm learns by fitting the residual of the trees that preceded it. Thus, boosting in a decision tree ensemble tends to improve accuracy with some small risk of less coverage.
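
A rough scikit-learn analogue (not the Studio module) that makes the residual-fitting idea concrete:

from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_friedman1(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each new tree is fit to the residual errors of the trees that preceded it
model = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1, max_depth=3,
                                  random_state=0).fit(X_train, y_train)
print(model.score(X_test, y_test))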

12
Q

Demographic parity constraint

A

Mitigate allocation harms in binary classification and regression

13
Q

Equalized odds constraint

A

Diagnose allocation and quality-of-service harms in binary classification

14
Q

Equal opportunity constraint

A

Diagnose allocation and quality-of-service harms in binary classification

15
Q

Bounded group loss constraint

A

Mitigate quality-of-service harms in regression

16
Q

Fairness algorithms for reduction

A

Reduction: These algorithms take a standard black-box machine learning estimator (e.g., a LightGBM model) and generate a set of retrained models using a sequence of re-weighted training datasets. For example, applicants of a certain gender might be up-weighted or down-weighted to retrain models and reduce disparities across different gender groups. Users can then pick a model that provides the best trade-off between accuracy (or other performance metric) and disparity, which generally would need to be based on business rules and cost calculations.
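
A minimal sketch assuming Fairlearn's GridSearch reduction with a DemographicParity constraint; the data is synthetic and only for illustration:

import numpy as np
from fairlearn.reductions import DemographicParity, GridSearch
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
sensitive = rng.integers(0, 2, size=500)   # e.g. a binary gender attribute
y = (X[:, 0] + 0.5 * sensitive + rng.normal(size=500) > 0).astype(int)

# The reduction retrains the base estimator on a sequence of re-weighted datasets
sweep = GridSearch(LogisticRegression(), constraints=DemographicParity(), grid_size=20)
sweep.fit(X, y, sensitive_features=sensitive)

# One model per re-weighting; pick the best accuracy/disparity trade-off for your use case
candidate_models = sweep.predictors_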

17
Q

Fairness algorithms for post-processing

A

Post-processing: These algorithms take an existing classifier and the sensitive feature as input. Then, they derive a transformation of the classifier’s prediction to enforce the specified fairness constraints. The biggest advantage of threshold optimization is its simplicity and flexibility as it does not need to retrain the model.
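
A minimal sketch assuming Fairlearn's ThresholdOptimizer; the data and classifier are illustrative:

import numpy as np
from fairlearn.postprocessing import ThresholdOptimizer
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
sensitive = rng.integers(0, 2, size=500)
y = (X[:, 0] + 0.5 * sensitive + rng.normal(size=500) > 0).astype(int)

# The existing (already trained) classifier is left untouched
clf = LogisticRegression().fit(X, y)

# Learn group-specific decision thresholds that enforce the fairness constraint
postprocessor = ThresholdOptimizer(estimator=clf, constraints="equalized_odds", prefit=True)
postprocessor.fit(X, y, sensitive_features=sensitive)
y_fair = postprocessor.predict(X, sensitive_features=sensitive)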

18
Q

Differential privacy

A

Differential privacy is a set of systems and practices that help keep the data of individuals safe and private.

19
Q

Epsilon in the context of privacy

A

A value known as epsilon measures how noisy or private a report is. Epsilon has an inverse relationship to noise or privacy. The lower the epsilon, the more noisy (and private) the data is.
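
A minimal sketch of the classic Laplace mechanism, which makes the inverse relationship concrete (noise scale = sensitivity / epsilon):

import numpy as np

def noisy_count(true_count, epsilon, sensitivity=1.0):
    # Smaller epsilon -> larger noise scale -> more private but less accurate
    rng = np.random.default_rng()
    return true_count + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

true_count = 120                              # e.g. number of people matching a query
print(noisy_count(true_count, epsilon=0.1))   # very noisy / very private
print(noisy_count(true_count, epsilon=5.0))   # close to the true count / less private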

20
Q

Matchbox Recommender

A

How this works: When a user is relatively new to the system, predictions are improved by making use of the feature information about the user, thus addressing the well-known “cold-start” problem. However, once you have collected a sufficient number of ratings from a particular user, it is possible to make fully personalized predictions for them based on their specific ratings rather than on their features alone. Hence, there is a smooth transition from content-based recommendations to recommendations based on collaborative filtering. Even if user or item features are not available, Matchbox will still work in its collaborative filtering mode.

21
Q

Parameter sweeping mode: Entire grid

A

Entire grid: When you select this option, the module loops over a grid predefined by the system, to try different combinations and identify the best learner. This option is useful for cases where you don’t know what the best parameter settings might be and want to try all possible combination of values.
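
The analogous concept outside the Studio module is an exhaustive grid search; a scikit-learn sketch with an illustrative parameter grid:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Every combination in the grid is tried; the best learner is kept
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.001]}
search = GridSearchCV(SVC(), param_grid, cv=5).fit(X, y)
print(search.best_params_, search.best_score_)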

22
Q

Parameter sweeping mode: Random sweep

A

Random sweep: When you select this option, the module will randomly select parameter values over a system-defined range. You must specify the maximum number of runs that you want the module to execute. This option is useful for cases where you want to increase model performance using the metrics of your choice but still conserve computing resources.
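
The analogous concept outside the Studio module is a randomized search with a capped number of runs; a scikit-learn sketch with illustrative parameter distributions:

from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# n_iter caps the maximum number of runs; parameter values are drawn at random
param_distributions = {"C": loguniform(1e-2, 1e2), "gamma": loguniform(1e-4, 1e-1)}
search = RandomizedSearchCV(SVC(), param_distributions, n_iter=20, cv=5,
                            random_state=0).fit(X, y)
print(search.best_params_, search.best_score_)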