Quiz #3 Flashcards
Exam prep
Features with a large number of distinct values will have lower intrinsic value than features with a small number of distinct values.
True
False
False
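The answer is False because intrinsic value (also called split information) is the entropy of the partition a feature induces, so a feature with many distinct values scores higher, not lower. A minimal sketch, assuming the standard C4.5-style definition; the branch counts below are made-up examples:

```python
import math

def intrinsic_value(counts):
    """Split information: entropy of the partition a feature induces."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts)

iv_few = intrinsic_value([5, 5])     # two equal branches -> 1.0 bit
iv_many = intrinsic_value([1] * 10)  # ten tiny branches -> log2(10), about 3.32 bits
```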
In binomial logistic regression, the best cut-off point is always at 0.5.
True
False
False
Rather than the sum of squares used in linear regression, in logistic regression the coefficients are estimated using a technique called ______________.
A. Mean Estimation of Means
B. Gradient Boosting
C. Maximum Likelihood Estimation
D. Maximum Logistic Error
C. Maximum Likelihood Estimation
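As a sketch of the quantity maximum likelihood estimation maximizes: the log-likelihood is the log of the probability the model assigns to the observed outcomes. The labels and predicted probabilities below are made-up examples:

```python
import math

def log_likelihood(y, p):
    """Log of the probability the model assigns to the observed labels y."""
    return sum(math.log(pi if yi == 1 else 1 - pi) for yi, pi in zip(y, p))

# Hypothetical labels and predicted probabilities:
ll = log_likelihood([1, 0, 1], [0.9, 0.2, 0.8])
```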
Given a set of candidate models from the same data, the model with the highest AIC is usually the “preferred” model.
True
False
False
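The answer is False because the lowest AIC is preferred: AIC = 2k − 2 ln(L) trades fit against complexity, penalizing extra parameters. A minimal sketch with made-up log-likelihoods:

```python
def aic(k_params, log_lik):
    """Akaike Information Criterion: 2k - 2*ln(L); lower is preferred."""
    return 2 * k_params - 2 * log_lik

aic_simple = aic(3, -100.0)   # 206.0
aic_complex = aic(5, -100.0)  # 210.0 -- same fit, more parameters, worse AIC
```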
Regression establishes causation between the independent variables and the dependent variable.
True
False
False
The recursive process used in logistic regression to minimize the cost function during maximum likelihood estimation is known as _________.
A. logit function
B. sum of squared errors
C. log odds
D. gradient descent
D. gradient descent
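A minimal sketch of the idea, minimizing a one-parameter quadratic cost rather than a real logistic-regression likelihood; the learning rate and step count are illustrative choices:

```python
def gradient_descent(grad, w=0.0, lr=0.1, steps=100):
    """Repeatedly step opposite the gradient to minimize the cost."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Cost (w - 3)^2 has gradient 2*(w - 3) and its minimum at w = 3.
w_best = gradient_descent(lambda w: 2 * (w - 3))
```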
k-NN is an example of a ________ model.
A. parametric
B. metric
C. unsupervised
D. non-parametric
D. non-parametric
Lazy learners such as k-Nearest Neighbor are also known as _______ learners.
A. rote learners
B. just-in-time learners
C. non-learners
D. instance-based learners
A. rote learners
For decision trees, entropy is a quantification of the level of ___________ within a set of class values.
A. randomness
B. static
C. disorder
D. nodes
A. randomness
A common type of distance measure used in k-NN is the _________ distance.
A. Euclidean
B. Nearest
C. Bayesian
D. Eucalyptus
A. Euclidean
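For reference, the Euclidean distance is the straight-line distance between two feature vectors; a minimal sketch:

```python
import math

def euclidean(p, q):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

d = euclidean([0, 0], [3, 4])  # the classic 3-4-5 triangle: 5.0
```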
Which of these is not a common approach to choosing the right value for K?
A. Use weighted voting where the closest neighbors have larger weights.
B. Half the number of training examples.
C. Test different k values against a variety of test datasets and choose the one that performs best.
D. The square root of the number of training examples.
B. Half the number of training examples.
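The square-root rule of thumb (option D) can be sketched as follows; the training-set size is made up, and bumping to an odd k is a common tie-avoidance tweak, not part of the rule itself:

```python
import math

n_train = 150                  # hypothetical number of training examples
k = round(math.sqrt(n_train))  # rule of thumb: k ~ sqrt(n)
if k % 2 == 0:                 # an odd k avoids ties in binary voting
    k += 1
```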
The link function used for binomial logistic regression is called the _____________.
A. logit function
B. logarithmic function
C. inverse function
D. logos function
A. logit function
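The logit is the log-odds transform, which links a probability in (0, 1) to the linear predictor; a minimal sketch:

```python
import math

def logit(p):
    """Log-odds of a probability p strictly between 0 and 1."""
    return math.log(p / (1 - p))

lo = logit(0.5)  # even odds -> 0.0
```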
Entropy is highest when the split is 50-50. However, as one class dominates the other, entropy reduces to zero.
True
False
True
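Both halves of the statement can be checked with the two-class entropy formula:

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a two-class split with proportion p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

e_even = entropy(0.5)  # 50-50 split -> 1.0, the maximum
e_pure = entropy(1.0)  # one class dominates completely -> 0.0
```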
In order to choose the next feature to split on, a decision tree learner calculates the Information Gain for features A, B, C and D as 0.022, 0.609, 0.841 and 0.145 respectively. Which feature will it next choose to split on?
A. Feature D
B. Feature C
C. Feature B
D. Feature A
B. Feature C
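The learner simply takes the feature with the maximum gain; using the values from the question:

```python
# Information Gain for each candidate feature, from the question above:
gains = {"A": 0.022, "B": 0.609, "C": 0.841, "D": 0.145}
best_feature = max(gains, key=gains.get)  # "C"
```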
Decision tree learners typically output the resulting tree structure in human-readable format. This makes them well suited for applications that require transparency for legal reasons or for knowledge transfer.
True
False
True
Before we use k-NN, what can we do if we have significant variance in the range of values for our features?
A. We convert them all to 0s and 1s.
B. We normalize the data.
C. We exclude the outlier features.
D. We create dummy variables.
B. We normalize the data.
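Min-max normalization is one common way to do this, rescaling each feature to [0, 1] so a wide-ranged feature cannot dominate the distance calculation; a minimal sketch:

```python
def min_max_normalize(values):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_normalize([10, 20, 30])  # [0.0, 0.5, 1.0]
```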
The K in k-NN has to do with ______________.
A. The number of unlabeled observations with the letter K.
B. The size of the training set.
C. The number of clusters that need to be created to properly label the unlabeled observation.
D. The number of labeled observations to compare with the unlabeled observation.
D. The number of labeled observations to compare with the unlabeled observation.
The logistic function is a sigmoid function that assumes values from ___ to ___ . Note: Your answer must be numeric.
0, 1
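The logistic (sigmoid) function squashes any real input into that (0, 1) range:

```python
import math

def sigmoid(x):
    """Logistic function: 1 / (1 + e^-x), always between 0 and 1."""
    return 1 / (1 + math.exp(-x))

mid = sigmoid(0)    # 0.5, the usual cut-off point
high = sigmoid(10)  # close to 1
low = sigmoid(-10)  # close to 0
```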
A small K makes a model susceptible to noise and/or outliers and can lead to ___________.
A. randomness
B. underfitting
C. overfitting
D. error
C. overfitting
For decision trees, the process of reducing the size of a tree so that it generalizes better is known as ___________.
A. purging
B. pruning
C. planing
D. partitioning
B. pruning