Lecture 9: Feature Engineering Flashcards

1
Q

Parameters

A

Parameter

- Parameters are components of the mathematical construct that generalizes the dataset in the form of a model or equation, e.g., the coefficients of linear regression
- Values of parameters are mathematically derived or algorithmically learned from the dataset
- Parameters are inherently internal to the model
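To make this concrete, here is a minimal sketch in plain Python (made-up data): the slope and intercept of a simple linear regression are parameters, because their values are derived mathematically from the dataset via the closed-form least-squares formulas rather than set by hand.

```python
# Hypothetical dataset lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares estimates: these are the model's parameters,
# computed from the data, not chosen by the engineer.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x

print(slope, intercept)  # → 2.0 1.0
```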

2
Q

Hyper-Parameter

A

Hyper-parameter

- Configuration variables whose values are not derived from the dataset
- Hyper-parameters are inherently external to the model
- Hyper-parameter values must be decided prior to the training process and are typically specified by the machine learning engineer
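The parameter/hyper-parameter split can be sketched in a few lines of plain Python (hypothetical setup): the learning rate and epoch count are hyper-parameters fixed before training, while the weight `w` is a parameter learned from the data.

```python
def train(xs, ys, learning_rate=0.05, epochs=200):
    """learning_rate and epochs are hyper-parameters: set by the
    engineer before training, external to the model itself."""
    w = 0.0  # w is a parameter: learned from the data below
    for _ in range(epochs):
        # gradient of mean squared error for the model y = w * x
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # y = 2x
print(train(xs, ys))  # converges to roughly 2.0
```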

3
Q

Hyper-Parameter Tuning

A

Tuning of Hyper-parameters

- Hyper-parameters might be many
  - e.g., for Decision Trees: resampling method, number of trees, maximum depth, number of splits per node, maximum number of samples per leaf, etc.
- Values of each of these hyper-parameters are numerous
  - Resulting in numerous possible combinations (can be visualized as a grid)
- There might exist ONE combination that results in the best-performing model
- The question is: how do we find that combination?
  - Trying out various combinations manually might be overwhelming
  - Essentially, an optimization problem
- Running the target algorithm for possible combinations of hyper-parameters is known as a parameter sweep
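The size of the grid is easy to see with a short sketch (hypothetical Decision Tree grid): each hyper-parameter has a list of candidate values, and the Cartesian product of those lists is the set of combinations a sweep would have to cover.

```python
from itertools import product

# Hypothetical candidate values for three Decision Tree hyper-parameters
grid = {
    "n_trees": [50, 100, 200],
    "max_depth": [3, 5, 10],
    "min_samples_leaf": [1, 5],
}

# Every combination in the grid: the search space of the sweep
combinations = list(product(*grid.values()))
print(len(combinations))  # → 18  (3 * 3 * 2)
```

Even with a handful of values per hyper-parameter the grid grows multiplicatively, which is why trying combinations manually quickly becomes overwhelming.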

4
Q

Hyper-parameter Tuning: Parameter Sweep Strategies

A

Entire Grid Sweep

- This is the case where the engineer doesn't know the optimal value of any of the hyper-parameters and needs to exhaustively evaluate all possible options
- Runs the predictor algorithm with all possible hyper-parameter value combinations
- Most expensive option in terms of computational resources and run time

Random Grid Sweep

- Used when one or some of the optimal values are known and fixed manually
- The unknown ones are left for the sweep, where values are picked randomly by the algorithm
- Can improve speed of execution significantly

Random Sweep

- Useful when some or all of the hyper-parameters are continuous variables
- The algorithm picks random values within a range
- Number of iterations can be controlled by another hyper-parameter
- Used when one or a limited number of hyper-parameters are targeted
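A random sweep over one continuous hyper-parameter can be sketched as follows (the `score` function is a hypothetical stand-in for training and evaluating a model): values are sampled uniformly from a range, and the iteration budget is itself set by the engineer.

```python
import random

def score(learning_rate):
    # Hypothetical stand-in for train-then-evaluate;
    # pretend the best learning rate is 0.1.
    return -abs(learning_rate - 0.1)

random.seed(0)     # reproducibility of this sketch only
n_iterations = 20  # sweep budget: another hyper-parameter

best_lr, best_score = None, float("-inf")
for _ in range(n_iterations):
    lr = random.uniform(0.001, 1.0)  # random value within a range
    s = score(lr)
    if s > best_score:
        best_lr, best_score = lr, s

print(best_lr)  # best candidate found within the budget
```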

5
Q

Overfitting / Underfitting: Other Techniques

A

K-Fold Cross Validation during Model Training

- K-Fold Randomized Stratified Data Partitioning
- Training & Test Data Split
- For Tree-based Algorithms: Tree Pruning
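A minimal K-fold sketch in plain Python (sequential folds only; the randomized, stratified partitioning the card mentions would shuffle and balance class labels first): each of the k folds serves once as the test split while the remaining data trains the model.

```python
def k_fold_splits(data, k):
    """Yield (train, test) pairs; each fold is the test set once."""
    fold_size = len(data) // k
    for i in range(k):
        test = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        yield train, test

data = list(range(10))
for train, test in k_fold_splits(data, k=5):
    print(len(train), len(test))  # → 8 2 on every fold
```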
