Quiz #3 Flashcards

Exam prep

1
Q

Features with a large number of distinct values will have lower intrinsic value than features with a small number of distinct values.
True
False

A

False
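
For context on why this is false: intrinsic value (also called split information) is the entropy of the feature's own value distribution, so it rises as the number of distinct values rises. A minimal Python sketch (function and variable names are illustrative):

```python
import math
from collections import Counter

def intrinsic_value(values):
    """Split information: the entropy of a feature's value distribution."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# More distinct values -> HIGHER intrinsic value, so the statement is false.
print(intrinsic_value(["a", "b", "c", "d"]))  # 2.0 (4 distinct values)
print(intrinsic_value(["a", "a", "b", "b"]))  # 1.0 (2 distinct values)
```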

2
Q

In binomial logistic regression, the best cut-off point is always at 0.5.
True
False

A

False
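
For context on why this is false: with imbalanced classes or asymmetric misclassification costs, a cut-off other than 0.5 often performs better. One common approach, sketched here with illustrative names, is to scan candidate thresholds and keep the one that maximizes a metric such as Youden's J:

```python
import numpy as np

def best_cutoff(y_true, p_hat):
    """Scan candidate thresholds; keep the one maximizing
    Youden's J = sensitivity + specificity - 1 (one option among several)."""
    best_t, best_j = 0.5, -1.0
    for t in np.linspace(0.05, 0.95, 19):
        y_pred = (p_hat >= t).astype(int)
        tp = np.sum((y_pred == 1) & (y_true == 1))
        tn = np.sum((y_pred == 0) & (y_true == 0))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        fn = np.sum((y_pred == 0) & (y_true == 1))
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        if sensitivity + specificity - 1 > best_j:
            best_t, best_j = t, sensitivity + specificity - 1
    return best_t
```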

3
Q
Rather than the sum of squares used in linear regression, in logistic regression, the coefficients are estimated using a technique called _______________.
A. Mean Estimation of Means
B. Gradient Boosting
C. Maximum Likelihood Estimation
D. Maximum Logistic Error
A

C. Maximum Likelihood Estimation
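
For context, maximum likelihood estimation chooses the coefficients that make the observed labels most probable. A minimal sketch of the quantity being maximized, with illustrative names:

```python
import numpy as np

def log_likelihood(beta, X, y):
    """Binomial log-likelihood maximized by MLE:
    sum of y*log(p) + (1 - y)*log(1 - p), with p = sigmoid(X @ beta)."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
```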

4
Q

Given a set of candidate models from the same data, the model with the highest AIC is usually the “preferred” model.

True
False

A

False
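
For context on why this is false: AIC = 2k - 2 ln(L) rewards fit but penalizes the number of parameters, and the model with the lowest AIC is the preferred one. A quick worked example (the numbers are made up for illustration):

```python
def aic(k, log_likelihood):
    """Akaike Information Criterion: 2k - 2*ln(L), where k is the number
    of estimated parameters. Lower AIC is preferred."""
    return 2 * k - 2 * log_likelihood

# Model A: 3 parameters, log-likelihood -120 -> AIC 246
# Model B: 5 parameters, log-likelihood -119 -> AIC 248 (worse despite the better fit)
print(aic(3, -120.0), aic(5, -119.0))
```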

5
Q

Regression establishes causation between the independent variables and the dependent variable.
True
False

A

False

6
Q

The iterative process used in logistic regression to minimize the cost function during maximum likelihood estimation is known as _________.

A. logit function
B. sum of squared errors
C. log odds
D. gradient descent

A

D. gradient descent
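
For context, gradient descent repeatedly nudges the coefficients in the direction that lowers the cost, here the negative log-likelihood. A minimal sketch with illustrative names:

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=1000):
    """Gradient descent for logistic regression.
    X: (n, d) feature matrix, y: (n,) array of 0/1 labels."""
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ beta))  # current predicted probabilities
        grad = X.T @ (p - y) / len(y)        # gradient of the average cost
        beta -= lr * grad                    # step downhill
    return beta
```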

7
Q
k-NN is an example of a ________ model.
A. parametric
B. metric
C. unsupervised
D. non-parametric
A

D. non-parametric

8
Q
Lazy learners such as k-Nearest Neighbor are also known as _______ learners.
A. rote learners
B. just-in-time learners
C. non-learners
D. instance-based learners
A

A. rote learners

9
Q
For decision trees, entropy is a quantification of the level of ___________ within a set of class values.
A. randomness
B. static
C. disorder
D. nodes
A

A. randomness

10
Q
A common type of distance measure used in k-NN is the _________ distance.
A. Euclidean
B. Nearest
C. Bayesian
D. Eucalyptus
A

A. Euclidean
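
For reference, the Euclidean distance is the square root of the sum of squared coordinate differences:

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean((1, 2), (4, 6)))  # 5.0
```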

11
Q

Which of these is not a common approach to choosing the right value for K?
A. Use weighted voting where the closest neighbors have larger weights.
B. Half the number of training examples.
C. Test different k values against a variety of test datasets and choose the one that performs best.
D. The square root of the number of training examples.

A

B. Half the number of training examples.
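
For reference, the square-root heuristic from option D can be sketched like this (odd values are often preferred in binary classification to avoid tied votes):

```python
import math

n_train = 400
k = int(math.sqrt(n_train))      # common heuristic: k is about sqrt(n)
k = k if k % 2 == 1 else k + 1   # nudge to an odd value to avoid ties
print(k)  # 21
```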

12
Q
The link function used for binomial logistic regression is called the _____________.
A. logit function
B. logarithmic function
C. inverse function
D. logos function
A

A. logit function
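
For reference, the logit maps a probability p to its log-odds, ln(p / (1 - p)):

```python
import math

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))

print(logit(0.5))  # 0.0
print(logit(0.9))  # ~2.197
```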

13
Q

Entropy is highest when the split is 50-50; as one class increasingly dominates the other, entropy falls to zero.
True
False

A

True
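
A quick numeric check of the statement, using binary entropy in bits:

```python
import math

def entropy(p):
    """Binary entropy of a class split (p, 1 - p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(entropy(0.5))   # 1.0  -> highest at a 50-50 split
print(entropy(0.9))   # ~0.47
print(entropy(0.99))  # ~0.08 -> approaches 0 as one class dominates
```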

14
Q
In order to choose the next feature to split on, a decision tree learner calculates the Information Gain for features A, B, C and D as 0.022, 0.609, 0.841 and 0.145 respectively. Which feature will it next choose to split on?
A. Feature D
B. Feature C
C. Feature B
D. Feature A
A

B. Feature C

15
Q

Decision tree learners typically output the resulting tree structure in human-readable format. This makes them well suited for applications that require transparency for legal reasons or for knowledge transfer.
True
False

A

True
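
As an illustration, scikit-learn can print a fitted tree as plain if/else rules (a sketch that assumes scikit-learn is installed; the dataset is just an example):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2).fit(iris.data, iris.target)
# export_text renders the learned tree as human-readable rules.
print(export_text(tree, feature_names=list(iris.feature_names)))
```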

16
Q
Before we use k-NN, what can we do if our features have significantly different ranges of values?
A. We convert them all to 0s and 1s.
B. We normalize the data.
C. We exclude the outlier features.
D. We create dummy variables.
A

B. We normalize the data.
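
For reference, one common choice is min-max normalization, which rescales each feature to [0, 1] so that features with large ranges don't dominate the distance calculation. A minimal sketch:

```python
def min_max_normalize(values):
    """Rescale a numeric feature to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(min_max_normalize([10, 20, 30, 40]))  # [0.0, 0.33..., 0.67..., 1.0]
```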

17
Q

The K in k-NN has to do with ______________.
A. The number of unlabeled observations with the letter K.
B. The size of the training set.
C. The number of clusters that need to be created to properly label the unlabeled observation.
D. The number of labeled observations to compare with the unlabeled observation.

A

D. The number of labeled observations to compare with the unlabeled observation.
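
To make the definition concrete, here is a minimal k-NN sketch (names are illustrative): the k labeled observations nearest to the query vote on its label.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k labeled points nearest to `query`.
    `train` is a list of ((x1, x2), label) pairs."""
    def dist2(p, q):  # squared Euclidean distance is enough for ranking
        return sum((a - b) ** 2 for a, b in zip(p, q))
    nearest = sorted(train, key=lambda item: dist2(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "red"), ((1, 0), "red"), ((5, 5), "blue"), ((6, 5), "blue")]
print(knn_predict(train, (0.5, 0.2)))  # red
```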

18
Q

The logistic function is a sigmoid function that assumes values from ___ to ___. Note: Your answer must be numeric.

A

0, 1
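
For reference, the logistic function 1 / (1 + e^(-x)) approaches 0 as x goes to negative infinity and 1 as x goes to positive infinity:

```python
import math

def sigmoid(x):
    """Logistic function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(-10))  # ~0.00005 -> approaches 0
print(sigmoid(0))    # 0.5
print(sigmoid(10))   # ~0.99995 -> approaches 1
```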

19
Q
A small K makes a model susceptible to noise and/or outliers and can lead to ___________.
A. randomness
B. underfitting
C. overfitting
D. error
A

C. overfitting

20
Q
For decision trees, the process of reducing the size of a tree so that it generalizes better is known as ___________.
A. purging
B. pruning
C. planing
D. partitioning
A

B. pruning
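
As an illustration, scikit-learn exposes cost-complexity pruning through the ccp_alpha parameter (a sketch that assumes scikit-learn is installed; the alpha value here is arbitrary):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
unpruned = DecisionTreeClassifier(random_state=0).fit(X, y)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.02).fit(X, y)
# Pruning trades training fit for a smaller tree that generalizes better.
print(unpruned.tree_.node_count, pruned.tree_.node_count)
```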