7.2 - Supervised Learning Algorithms Flashcards

1
Q

______ _______ models reduce the problem of ______ by imposing a penalty based on the number of features (i.e., _________ variables) used by the model.

A

Penalized regression;
overfitting;
independent;

2
Q

In a penalized regression model, the penalty value _________ with the number of independent variables (i.e., ______) used.

A

increases;

features;

3
Q

Imposing a penalty on the number of features makes a model more _________.

A

parsimonious;

4
Q

Penalized regression models seek to minimize the sum __ _____ _____ (___), as well as the magnitude of the _____ _____.

A

of squared errors (SSE);

penalty value;

5
Q

LASSO stands for what?

A

Least absolute shrinkage and selection operator.

6
Q

In addition to minimizing SSE, LASSO minimizes what?

A

The sum of the absolute values of the slope coefficients.
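
As a rough illustration of the cards above, the quantity LASSO minimizes can be written as SSE plus the penalty hyperparameter times the sum of the absolute slope coefficients. The sketch below only evaluates that objective for given coefficients; it does not fit a model. The function name `lasso_objective`, the parameter `lam`, and the toy data are hypothetical.

```python
def lasso_objective(X, y, intercept, slopes, lam):
    """Return SSE + lam * sum(|slope|) for a linear model (illustrative only)."""
    sse = 0.0
    for row, target in zip(X, y):
        prediction = intercept + sum(b * x for b, x in zip(slopes, row))
        sse += (target - prediction) ** 2
    # The penalty grows with the size of every nonzero slope coefficient.
    penalty = lam * sum(abs(b) for b in slopes)
    return sse + penalty


# Hypothetical toy data: three observations, two features.
X = [[1.0, 0.0], [2.0, 1.0], [3.0, 1.0]]
y = [2.0, 4.0, 6.0]

# A model that uses only the first feature versus one that uses both.
one_feature = lasso_objective(X, y, 0.0, [2.0, 0.0], lam=0.5)
two_features = lasso_objective(X, y, 0.0, [1.0, 2.0], lam=0.5)
```

With these made-up numbers the one-feature model has the lower objective, because it fits the data without spending any penalty on the second coefficient.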

7
Q

LASSO automatically eliminates what?

A

The least predictive features of a regression model.

8
Q

In a LASSO model, the penalty term, referred to by the Greek letter ______, is the __________ that determines the balance between _________ the model and keeping it _________.

A

lambda;
hyperparameter;
overfitting;
parsimonious;

9
Q

In a LASSO model, you want both the _____ and _______ (the value of the penalty) to be low.

A

SSE;

lambda;

10
Q

A method related to LASSO that is used to reduce statistical variability in a high-dimensional data estimation problem is referred to as __________. This method forces the ______ ________ of nonperforming features toward _______.

A

regularization;
beta coefficients;
zero;

11
Q

What does SVM stand for?

A

support vector machine

12
Q

SVM is a linear classification algorithm that does what?

A

SVM separates the data into one of two possible classes (e.g., sell vs. buy).
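
To make the two-class idea concrete, here is a minimal sketch of how a linear classifier assigns a label once a boundary is known: it checks which side of the line an observation falls on. The weights and bias below are hypothetical placeholders, not fitted values; an actual SVM would learn them from training data.

```python
def linear_classify(features, weights, bias):
    """Label an observation by the side of the linear boundary it falls on."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "buy" if score >= 0 else "sell"


# Hypothetical boundary parameters (assumed, not learned here).
weights = [0.8, -0.5]
bias = -0.1

label_a = linear_classify([1.0, 0.2], weights, bias)  # positive score
label_b = linear_classify([0.1, 1.0], weights, bias)  # negative score
```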

13
Q

What does KNN stand for?

A

K-nearest neighbor.

14
Q

KNN is more commonly used in _________ (but sometimes in ________).

A

classification;

regression;

15
Q

KNN is used to classify an observation based on _________ to the observations in the __________ ________.

A

nearness;

training sample;

16
Q

In KNN, the researcher specifies the value of __, the __________, triggering the algorithm to look for the __ _________ in the sample that are closest to the new observation that is being _____________.

A

k;
hyperparameter;
k observations;
classified;

17
Q

As an example of KNN, if k=5, the algorithm will look for what?

A

The 5 nearest neighbors (i.e., the 5 most similar observations).
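
The k=5 case can be sketched in a few lines: sort the training sample by distance to the new observation, keep the k closest, and take the most common label. This is a minimal illustration assuming Euclidean distance and a simple majority vote; the bond data and the labels "IG" and "HY" are made up.

```python
import math
from collections import Counter


def knn_classify(training, new_point, k):
    """Classify new_point by majority label among its k nearest training points."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], new_point))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training sample: (features, ratings class).
training = [
    ([0.20, 8.0], "IG"),
    ([0.30, 7.0], "IG"),
    ([0.25, 9.0], "IG"),
    ([0.70, 2.0], "HY"),
    ([0.80, 1.5], "HY"),
    ([0.60, 3.0], "HY"),
]

predicted = knn_classify(training, [0.65, 2.5], k=5)
```

Here the five nearest neighbors are three "HY" bonds and two "IG" bonds, so the new bond is assigned to "HY". In practice the features would usually be put on a common scale first, since distance is sensitive to units.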

18
Q

What are some investment applications of KNN?

A

Predicting bankruptcy; assigning a bond to a ratings class; predicting stock prices; creating customized indices;

19
Q

What does CART stand for?

A

Classification and regression trees.

20
Q

Classification trees (aka “____”) are appropriate when the ______ _____ is __________, and are typically used when the target is ________ (e.g., an IPO will be successful vs. not successful).

A

CART;
target variable;
categorical;
binary;

21
Q

Classification trees contain two types of _____: (1) __________ ____ and (2) ________ _____. They repeatedly divide the data until a _______ _____ is reached, which becomes the end of each branch.

A

nodes;
decision nodes;
terminal nodes;
terminal node;

22
Q

What are some investment applications of CART?

A

detecting fraudulent financial statements (e.g., these are likely manipulators vs. these are likely not manipulators); selecting stocks and bonds;

23
Q

________ _______ is the technique of combining predictions from multiple models rather than from a single model.

A

Ensemble learning;

24
Q

The purpose of using _______ ________ models is that an individual model will have a certain error rate and will make _____ _________. But by ________ predictions from many models, you can reduce the _______.

A

ensemble learning;
noisy predictions;
averaging;
noise;
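
A small sketch of what combining predictions can look like, assuming a simple mean for numeric forecasts and a majority vote for class labels. The function names and the sample predictions are hypothetical.

```python
from collections import Counter


def average_prediction(values):
    """Combine numeric predictions from several models by taking their mean."""
    return sum(values) / len(values)


def majority_vote(labels):
    """Combine class predictions from several models by taking the most common."""
    return Counter(labels).most_common(1)[0][0]


# Hypothetical outputs from individual models.
return_forecasts = [0.04, 0.06, 0.05]
class_predictions = ["buy", "sell", "buy", "buy", "sell"]

ensemble_forecast = average_prediction(return_forecasts)
ensemble_class = majority_vote(class_predictions)
```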

25
Q

In “________”, the ________ training set is used to generate “__” training data sets or “____” of data. Each new ___ is generated by ______ ________ with ___________. _________ helps to improve the stability of predictions and protects against overfitting the model.

A

bagging;
original;
n;
bags;
bag;
random sampling;
replacement;
Bagging;

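
The bag-generation step can be sketched as follows: each bag is drawn from the original training set by random sampling with replacement, so some observations repeat and others are left out. This is a minimal illustration; the function name `make_bags`, the fixed seed, and the choice to make each bag the same size as the original set are assumptions.

```python
import random


def make_bags(training_set, n_bags, seed=0):
    """Generate n_bags bootstrap samples, each drawn with replacement."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    size = len(training_set)
    return [
        [rng.choice(training_set) for _ in range(size)]
        for _ in range(n_bags)
    ]


# Hypothetical original training set of ten observations.
original = list(range(10))
bags = make_bags(original, n_bags=3)
```

A separate model would then be trained on each bag and their predictions combined.
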
26
Q

_______ _______ is a variant of classification trees whereby a large number of classification trees are trained using data __________ from the same data set.

A

Random forest;

bagged;