Midterm Qs Flashcards

Question 1

Q

Logistic Regression assumes that

Answer

A

the log odds of the response categories are linear in the predictors
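
As a minimal sketch of what this linearity means (the coefficients below are hypothetical, chosen only for illustration): converting the predicted probabilities back to log odds gives a straight line in x, so each unit step in x changes the log odds by the slope.

```python
import math

def log_odds(p):
    """Logit: log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def predicted_probability(x, intercept, slope):
    """Inverse logit of the linear predictor intercept + slope * x."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

# Hypothetical coefficients, not taken from any fitted model.
intercept, slope = -1.0, 0.5

# Log odds of the predicted probabilities at x = 0, 1, 2, 3.
logits = [log_odds(predicted_probability(x, intercept, slope)) for x in range(4)]

# Differences between consecutive log odds; each one equals the slope.
steps = [b - a for a, b in zip(logits, logits[1:])]
```

The probabilities themselves follow an S-shaped curve; only the log odds are linear.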

Question 2

Q

O

Question 3

Q

Distances between observations measured on mixed (categorical, continuous and binary) variables can be calculated using

Answer

A

Gower’s distance
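
A simplified sketch of the idea, assuming equal variable weights, no missing values, and binary variables treated like categorical ones (the records and the `kinds`/`ranges` arguments are made up for illustration): numeric variables contribute a range-scaled absolute difference, categorical variables contribute 0 for a match and 1 for a mismatch, and the distance is the average.

```python
def gower_distance(a, b, kinds, ranges):
    """Average per-variable dissimilarity between two records.

    kinds[k] is "num" or "cat"; ranges[k] is the observed range of a
    numeric variable (ignored for categorical ones).
    """
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "num":
            # Range-scaled absolute difference, between 0 and 1.
            total += abs(x - y) / rng if rng else 0.0
        else:
            # Simple matching: 0 if the categories agree, 1 otherwise.
            total += 0.0 if x == y else 1.0
    return total / len(kinds)

# Hypothetical records: (age, colour, smoker)
kinds = ["num", "cat", "cat"]
ranges = [40.0, None, None]  # assume age spans 40 years in this made-up data
d = gower_distance((30, "red", True), (50, "blue", True), kinds, ranges)
```

Here the three per-variable dissimilarities are 20/40, 1 and 0, so `d` is their mean.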

Question 4

Q

Asymmetric binary distance ignores

Answer

A

0-0 matches
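
A small sketch of this, assuming 0/1 coded vectors (the example vectors are made up): the distance is the number of mismatches divided by the number of positions where at least one vector has a 1, so positions where both are 0 never enter the calculation.

```python
def asymmetric_binary_distance(x, y):
    """Mismatches divided by positions where at least one value is 1."""
    both_one = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    mismatch = sum(1 for a, b in zip(x, y) if a != b)
    denom = both_one + mismatch  # 0-0 matches are left out of the denominator
    return mismatch / denom if denom else 0.0

# Hypothetical presence/absence vectors.
x = [1, 0, 0, 1, 0, 0]
y = [1, 1, 0, 0, 0, 0]
d = asymmetric_binary_distance(x, y)

# Appending extra 0-0 positions leaves the distance unchanged.
d_padded = asymmetric_binary_distance(x + [0, 0], y + [0, 0])
```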

Question 5

Q

The bootstrap, and its most common use, can be best summarized as

Answer

A

Repeated sampling from the data with replacement, fitting a model to those samples and investigating the changes in parameter estimates
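
A minimal sketch of the procedure, using the sample mean as a stand-in for "fitting a model" (the data, the number of resamples and the seed are arbitrary assumptions): each resample is drawn with replacement at the original sample size, and the spread of the resulting estimates approximates the standard error.

```python
import random
import statistics

def bootstrap_estimates(data, estimator, n_resamples=1000, seed=0):
    """Re-estimate on resamples drawn with replacement from data."""
    rng = random.Random(seed)
    n = len(data)
    return [estimator(rng.choices(data, k=n)) for _ in range(n_resamples)]

# Made-up sample, for illustration only.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]

estimates = bootstrap_estimates(data, statistics.mean)

# Standard deviation of the bootstrap estimates: the bootstrap standard error.
boot_se = statistics.stdev(estimates)
```

Any other estimator that takes a list of observations (a median, a regression coefficient) could be passed in place of `statistics.mean`.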

Question 6

Q

The jackknife method could be summarized as

Answer

A

Application of leave-one-out CV to the estimation of standard errors and bias
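
A sketch of the usual leave-one-out recipe, assuming the standard jackknife formulas for bias and standard error (the data are made up): the estimator is recomputed n times, each time with one observation removed.

```python
import math
import statistics

def jackknife(data, estimator):
    """Return (bias estimate, standard error) from leave-one-out estimates."""
    n = len(data)
    full = estimator(data)
    # Recompute the estimate with each observation left out in turn.
    loo = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    bias = (n - 1) * (loo_mean - full)
    se = math.sqrt((n - 1) / n * sum((t - loo_mean) ** 2 for t in loo))
    return bias, se

# Made-up sample, for illustration only.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]
bias, se = jackknife(data, statistics.mean)
```

For the sample mean, the jackknife standard error coincides with the familiar s / sqrt(n) and the bias estimate is zero, which makes a convenient sanity check.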

Question 7

Q

R function to fit logistic regression

Question 8

Q

Which of the following statements generally holds true about testing and training sets?

The log loss of the test set equals the log loss of the training set

The log loss of the test set is less than the log loss of the training set

The log loss of the test set is larger than the log loss of the training set

The log loss of the test set does not generally have any relationship with the log loss of the training set

Answer

A

The log loss of the test set is larger than the log loss of the training set

Question 9

Q

A simple model is most at risk of suffering from high…

Question 10

Q

A flexible model is most at risk of suffering from high…

Question 11

Q

In multiple linear regression, the R^2 value provides the…

Answer

A

proportion of the total variation in the response variable explained by the model
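
As a quick worked sketch (the observed and fitted values are hypothetical): R^2 is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the response.

```python
def r_squared(y, y_hat):
    """1 - SS_res / SS_tot for observed y and fitted y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # unexplained variation
    ss_tot = sum((a - mean_y) ** 2 for a in y)            # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical observed responses and fitted values.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]
r2 = r_squared(y, y_hat)
```

With these numbers the total sum of squares is 5 and the residual sum of squares is 1, so the fitted values account for 80% of the variation.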

Question 12

Q

Q. 12: see the picture

Question 13

Q

d

Question 14

Q

What is a p-value

Answer

A

the probability of observing a test statistic as extreme as, or more extreme than, the one we observed, assuming the null hypothesis is true
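
A small sketch of the definition, assuming a test statistic that follows a standard normal distribution under the null hypothesis (the observed value 1.96 is just an example): the two-sided p-value is the probability of a value at least as far from zero as the one observed.

```python
import math

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, via the complementary error function."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical observed test statistic.
p = two_sided_p_value(1.96)
```

A statistic of 1.96 gives a p-value of roughly 0.05, and a statistic of 0 gives a p-value of 1.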

Question 15

Q

look at pic

Question 16

Q

Which of the following is True about hierarchical clustering?

Resulting cluster memberships depend on random starting points (non-deterministic)

Resulting cluster memberships do not depend on the scale of the data (scale invariant)

Resulting cluster memberships do not depend on the distance-measure matrix

Resulting cluster memberships depend on a chosen linkage method

Answer

A

Resulting cluster memberships depend on a chosen linkage method

Question 17

Q

Which of the following statements is FALSE about k-means clustering?

Resulting cluster memberships depend on random starting points (non-deterministic)

Resulting cluster memberships depend on the scale of the data (not scale invariant)

Resulting cluster memberships provide the global maxima for the within group sum of squares

Resulting cluster memberships do not depend on a chosen linkage method

Answer

A

Resulting cluster memberships provide the global maxima for the within group sum of squares

(17 cards)