7.1 - Types of Learning and Overfitting Problems Flashcards
_______ _________ uses labeled training data to guide the ML program toward superior forecasting accuracy.
Supervised learning
Typical tasks for supervised learning include _________ and __________.
classification ; regression
If the target variable is ___________, the model involved is a regression model.
continuous
Multiple regression is an example of ________ learning.
supervised
Classification models are used in cases where the target variable is ________ or _________ (e.g., ranking).
categorical ; ordinal
Algorithms can be designed for __________ classification (e.g., classifying companies as likely to default vs. not likely to default) or _____________ classification (e.g., a ratings class for bonds).
binary ; multicategory
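The binary vs. multicategory distinction can be sketched with two toy classifiers. The score thresholds and category names below are hypothetical illustrations, not a real model; an actual classifier learns its decision rule from labeled training data.

```python
# Hypothetical score thresholds for illustration only; a real ML model
# learns its decision rule from labeled training data.

def classify_default(score: float) -> str:
    """Binary classification: likely to default vs. not likely to default."""
    return "default" if score < 0.4 else "no_default"

def classify_rating(score: float) -> str:
    """Multicategory (ordinal) classification: a ratings class for bonds."""
    if score >= 0.8:
        return "AAA"
    elif score >= 0.6:
        return "BBB"
    return "junk"

print(classify_default(0.3))  # default
print(classify_rating(0.7))   # BBB
```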
In ____________ learning, the ML program is not given labeled training data; instead, inputs (i.e., _________) are provided without any _____________ about those inputs.
unsupervised ; features ; conclusions
___________ is an example of an unsupervised ML program.
Clustering
In __________ learning, in the absence of any ________ variable, the program seeks out structure or interrelationships in the data.
unsupervised ; target
_______ _________ algorithms are used for complex tasks such as facial (image) recognition, natural language processing, etc.
Deep learning
Programs that learn from their own prediction errors are called _________ ____________ algorithms.
reinforcement learning
Deep learning algorithms and reinforcement learning algorithms are based on ________ ________, a group of ML algorithms applied to problems with significant _____________.
neural networks; nonlinearities
__________ is an issue with supervised ML that results when a large number of ________ (i.e., independent variables) are included in the data sample.
Overfitting; features
In overfitting, ___________ is misperceived to be a pattern, resulting in high __-_____ __-________.
randomness; in-sample R-squared
Overfitting has occurred when the noise in the ________ ________ seems to improve the model fit.
target variables
Overfit models do not _________ ________ to new data (i.e., ____-___-____ ___-______ will be low).
generalize well ; out-of-sample R-squared
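The overfitting cards above can be illustrated with a toy experiment: a model that simply memorizes its training points achieves a perfect in-sample R-squared but a lower out-of-sample R-squared on new data. The data-generating rule (y = 2x plus noise) and the nearest-point lookup are illustrative assumptions, not a real ML algorithm.

```python
import random

random.seed(0)

# Illustrative data: true relationship is y = 2x plus noise (an assumption for this sketch)
def make_sample(n):
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + random.gauss(0, 1) for x in xs]
    return xs, ys

def r_squared(actual, predicted):
    mean_y = sum(actual) / len(actual)
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot

train_x, train_y = make_sample(20)  # in-sample (training) data
test_x, test_y = make_sample(20)    # out-of-sample (new) data

# "Overfit" model: memorizes every training point, predicts the y of the nearest one
memory = dict(zip(train_x, train_y))
def overfit_predict(x):
    nearest = min(memory, key=lambda m: abs(m - x))
    return memory[nearest]

in_sample = r_squared(train_y, [overfit_predict(x) for x in train_x])
out_sample = r_squared(test_y, [overfit_predict(x) for x in test_x])
print(in_sample)   # 1.0 — noise is treated as a pattern, so the in-sample fit looks perfect
print(out_sample)  # lower — the memorizing model does not generalize as well to new data
```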
You “train” the model using __-_____ data.
in-sample
When a model _______ ______, it means that the model retains its explanatory power when it is applied to new (i.e., ___-__-_____) data.
generalizes well; out-of-sample
To measure how well a model generalizes, data analysts create three non-overlapping data sets. List them and provide a brief explanation of each.
(1) training sample = used to develop the model. (2) validation sample = used for tuning the model. (3) test sample = used for evaluating the model using new data.
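The three-way split above can be sketched in a few lines. The 60/20/20 proportions are an illustrative assumption; the key property is that the samples do not overlap.

```python
import random

random.seed(42)
observations = list(range(100))  # stand-in for 100 labeled observations
random.shuffle(observations)

# An illustrative 60/20/20 split; the exact proportions are an assumption
train      = observations[:60]    # (1) training sample: develop the model
validation = observations[60:80]  # (2) validation sample: tune the model
test       = observations[80:]    # (3) test sample: evaluate the model on new data

# The three samples are non-overlapping
assert set(train).isdisjoint(validation)
assert set(train).isdisjoint(test)
assert set(validation).isdisjoint(test)
print(len(train), len(validation), len(test))  # 60 20 20
```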
In-sample prediction errors occur with the ________ sample, while prediction errors in the ______ and ______ samples are known as out-of-sample errors.
training ; validation and test
Which supervised learning model is most appropriate (1) when the Y-variable is continuous and (2) when the Y-variable is categorical?
A) Continuous: Decision trees ; Categorical: Regression
B) Continuous: Classification ; Categorical: Neural networks
C) Continuous: Regression ; Categorical: Classification
When the Y-variable is continuous, the appropriate approach is that of regression (used in a broad, ML context). When the Y-variable is categorical (i.e., belonging to a category or classification) or ordinal (i.e., ordered or ranked), a classification model is used. C
In machine learning, out-of-sample error equals: A) forecast error plus expected error plus regression error. B) standard error plus data error plus prediction error. C) bias error plus variance error plus base error.
Out-of-sample error equals bias error plus variance error plus base error. Bias error is the extent to which a model fits the training data. Variance error describes the degree to which a model’s results change in response to new data from validation and test samples. Base error comes from randomness in the data. C
Overfitting is least likely to result in: A) higher forecasting accuracy in out-of-sample data. B) higher number of features included in the data set. C) inclusion of noise in the model.
Overfitting results when a large number of features (i.e., independent variables) are included in the data sample. The resulting model can use the “noise” in the dependent variables to improve the model fit. Overfitting the model in this way will actually decrease the accuracy of model forecasts on other (out-of-sample) data. A
Which of the following about unsupervised learning is most accurate? A) There is no labeled data. B) Classification is an example of unsupervised learning algorithm. C) Unsupervised learning has lower forecasting accuracy as compared to supervised learning.
In unsupervised learning, the ML program is not given labeled training data. Instead, inputs are provided without any conclusions about those inputs. In the absence of any tagged data, the program seeks out structure or interrelationships in the data. Clustering is one example of the output of an unsupervised ML program, while classification is suited for supervised learning. A
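Clustering can be sketched with a minimal one-dimensional k-means, one common clustering algorithm. The data points, starting centers, and k = 2 below are illustrative assumptions; note that no labels (no target variable) are supplied — the algorithm finds structure on its own.

```python
# Minimal 1-D k-means sketch; data and initial centers are illustrative assumptions.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.8]  # unlabeled inputs (features only)

def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # Assign each point to its nearest center (no target variable involved)
        clusters = {c: [] for c in centers}
        for p in points:
            nearest = min(centers, key=lambda c: abs(c - p))
            clusters[nearest].append(p)
        # Move each center to the mean of its assigned points
        centers = [sum(v) / len(v) if v else c for c, v in clusters.items()]
    return sorted(centers)

centers = kmeans_1d(points, [0.0, 5.0])
print(centers)  # two cluster centers emerge, near 1.0 and 9.1
```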