7.1 - Types of Learning and Overfitting Problems Flashcards
_______ _________ uses labeled training data to guide the ML program toward superior forecasting accuracy.
Supervised learning
Typical tasks for supervised learning include _________ and __________.
classification ; regression
If the target variable is ___________, the model involved is a regression model.
continuous
Multiple regression is an example of ________ learning.
supervised
Classification models are used in cases where the target variable is ________ or _________ (e.g., ranking).
categorical ; ordinal
Algorithms can be designed for __________ classification (e.g., classifying companies as likely to default vs. not likely to default) or _____________ classification (e.g., a ratings class for bonds).
binary ; multicategory
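The binary vs. multicategory distinction can be sketched with two toy classifiers. The score thresholds and category names below are hypothetical illustrations, not a real model; an actual classifier learns its decision rule from labeled training data.

```python
# Hypothetical score thresholds for illustration only; a real ML model
# learns its decision rule from labeled training data.

def classify_default(score: float) -> str:
    """Binary classification: likely to default vs. not likely to default."""
    return "default" if score < 0.4 else "no_default"

def classify_rating(score: float) -> str:
    """Multicategory (ordinal) classification: a ratings class for bonds."""
    if score >= 0.8:
        return "AAA"
    elif score >= 0.6:
        return "BBB"
    return "junk"

print(classify_default(0.3))  # default
print(classify_rating(0.7))   # BBB
```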
In ____________ learning, the ML program is not given labeled training data; instead, inputs (i.e., _________) are provided without any _____________ about those inputs.
unsupervised ; features ; conclusions
___________ is an example of an unsupervised ML program.
Clustering
In __________ learning, in the absence of any ________ variable, the program seeks out structure or interrelationships in the data.
unsupervised ; target
_______ _________ algorithms are used for complex tasks such as facial (image) recognition, natural language processing, etc.
Deep learning
Programs that learn from their own prediction errors are called _________ ____________ algorithms.
reinforcement learning
Deep learning algorithms and reinforcement learning algorithms are based on ________ ________, a group of ML algorithms applied to problems with significant _____________.
neural networks; nonlinearities
__________ is an issue with supervised ML that results when a large number of ________ (i.e., independent variables) are included in the data sample.
Overfitting; features
In overfitting, ___________ is misperceived to be a pattern, resulting in high __-_____ __-________.
randomness; in-sample R-squared
Overfitting has occurred when the noise in the ________ ________ seems to improve the model fit.
target variables
Overfit models do not _________ ________ to new data (i.e., ____-___-____ ___-______ will be low).
generalize well ; out-of-sample R-squared
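The overfitting cards above can be illustrated with a toy experiment: a model that simply memorizes its training points achieves a perfect in-sample R-squared but a lower out-of-sample R-squared on new data. The data-generating rule (y = 2x plus noise) and the nearest-point lookup are illustrative assumptions, not a real ML algorithm.

```python
import random

random.seed(0)

# Illustrative data: true relationship is y = 2x plus noise (an assumption for this sketch)
def make_sample(n):
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + random.gauss(0, 1) for x in xs]
    return xs, ys

def r_squared(actual, predicted):
    mean_y = sum(actual) / len(actual)
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot

train_x, train_y = make_sample(20)  # in-sample (training) data
test_x, test_y = make_sample(20)    # out-of-sample (new) data

# "Overfit" model: memorizes every training point, predicts the y of the nearest one
memory = dict(zip(train_x, train_y))
def overfit_predict(x):
    nearest = min(memory, key=lambda m: abs(m - x))
    return memory[nearest]

in_sample = r_squared(train_y, [overfit_predict(x) for x in train_x])
out_sample = r_squared(test_y, [overfit_predict(x) for x in test_x])
print(in_sample)   # 1.0 — noise is treated as a pattern, so the in-sample fit looks perfect
print(out_sample)  # lower — the memorizing model does not generalize as well to new data
```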
You “train” the model using __-_____ data.
in-sample
When a model _______ ______, it means that the model retains its explanatory power when it is applied to new (i.e., ___-__-_____) data.
generalizes well; out-of-sample
To measure how well a model generalizes, data analysts create three non-overlapping data sets. List them and provide a brief explanation of each.
(1) training sample = used to develop the model. (2) validation sample = used for tuning the model. (3) test sample = used for evaluating the model using new data.
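The three-way split above can be sketched in a few lines. The 60/20/20 proportions are an illustrative assumption; the key property is that the samples do not overlap.

```python
import random

random.seed(42)
observations = list(range(100))  # stand-in for 100 labeled observations
random.shuffle(observations)

# An illustrative 60/20/20 split; the exact proportions are an assumption
train      = observations[:60]    # (1) training sample: develop the model
validation = observations[60:80]  # (2) validation sample: tune the model
test       = observations[80:]    # (3) test sample: evaluate the model on new data

# The three samples are non-overlapping
assert set(train).isdisjoint(validation)
assert set(train).isdisjoint(test)
assert set(validation).isdisjoint(test)
print(len(train), len(validation), len(test))  # 60 20 20
```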
In-sample prediction errors occur with the ________ sample, while prediction errors in the ______ and ______ samples are known as out-of-sample errors.
training ; validation and test
Which supervised learning model is most appropriate (1) when the Y-variable is continuous and (2) when the Y-variable is categorical?
A) Continuous: Decision trees ; Categorical: Regression
B) Continuous: Classification ; Categorical: Neural networks
C) Continuous: Regression ; Categorical: Classification
When the Y-variable is continuous, the appropriate approach is that of regression (used in a broad, ML context). When the Y-variable is categorical (i.e., belonging to a category or classification) or ordinal (i.e., ordered or ranked), a classification model is used. C
In machine learning, out-of-sample error equals: A) forecast error plus expected error plus regression error. B) standard error plus data error plus prediction error. C) bias error plus variance error plus base error.
Out-of-sample error equals bias error plus variance error plus base error. Bias error is the extent to which a model fits the training data. Variance error describes the degree to which a model’s results change in response to new data from validation and test samples. Base error comes from randomness in the data. C
Overfitting is least likely to result in: A) higher forecasting accuracy in out-of-sample data. B) higher number of features included in the data set. C) inclusion of noise in the model.
Overfitting results when a large number of features (i.e., independent variables) are included in the data sample. The resulting model can use the “noise” in the dependent variables to improve the model fit. Overfitting the model in this way will actually decrease the accuracy of model forecasts on other (out-of-sample) data. A
Which of the following about unsupervised learning is most accurate? A) There is no labeled data. B) Classification is an example of unsupervised learning algorithm. C) Unsupervised learning has lower forecasting accuracy as compared to supervised learning.
In unsupervised learning, the ML program is not given labeled training data. Instead, inputs are provided without any conclusions about those inputs. In the absence of any tagged data, the program seeks out structure or interrelationships in the data. Clustering is one example of the output of an unsupervised ML program, while classification is suited for supervised learning. A
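Clustering can be sketched with a minimal one-dimensional k-means, one common clustering algorithm. The data points, starting centers, and k = 2 below are illustrative assumptions; note that no labels (no target variable) are supplied — the algorithm finds structure on its own.

```python
# Minimal 1-D k-means sketch; data and initial centers are illustrative assumptions.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.8]  # unlabeled inputs (features only)

def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # Assign each point to its nearest center (no target variable involved)
        clusters = {c: [] for c in centers}
        for p in points:
            nearest = min(centers, key=lambda c: abs(c - p))
            clusters[nearest].append(p)
        # Move each center to the mean of its assigned points
        centers = [sum(v) / len(v) if v else c for c, v in clusters.items()]
    return sorted(centers)

centers = kmeans_1d(points, [0.0, 5.0])
print(centers)  # two cluster centers emerge, near 1.0 and 9.1
```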