Midterm Qs Flashcards

1
Q

Logistic Regression assumes that

A

the log odds of the response categories are linear in the predictors
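A minimal sketch of what "linear log odds" means. The coefficients `b0` and `b1` are made-up values for illustration, not from the course:

```python
import math

def sigmoid(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Logit: log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -1.0, 0.5

xs = [0.0, 1.0, 2.0, 3.0]
probs = [sigmoid(b0 + b1 * x) for x in xs]
logits = [log_odds(p) for p in probs]

# Each unit increase in x changes the log odds by the same amount (b1),
# even though the probabilities themselves change non-linearly.
steps = [logits[i + 1] - logits[i] for i in range(len(logits) - 1)]
```

The probabilities follow an S-shaped curve, but the entries of `steps` are all equal to `b1`.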

2
Q

O

A

O

3
Q

Distances between observations measured on mixed (categorical, continuous and binary) variables can be calculated using

A

Gower’s distance
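A rough sketch of the idea behind Gower's distance, assuming the simplest variant: numeric variables contribute a range-scaled absolute difference, categorical and binary variables contribute 0 on a match and 1 on a mismatch, and the result is the plain average. The record layout and the helper name `gower_distance` are hypothetical. In R this is usually done with `daisy(..., metric = "gower")` from the `cluster` package.

```python
def gower_distance(a, b, kinds, ranges):
    """Average per-variable dissimilarity between records a and b.

    kinds[k] is "num" or "cat"; ranges[k] is the observed range of a
    numeric variable (ignored for categorical ones).
    """
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind == "num":
            # Range-scaled absolute difference, always in [0, 1].
            total += abs(x - y) / rng if rng else 0.0
        else:
            # Categorical or binary: 0 on a match, 1 on a mismatch.
            total += 0.0 if x == y else 1.0
    return total / len(kinds)

# Hypothetical records: (age, colour, smoker)
kinds = ["num", "cat", "cat"]
ranges = [40.0, None, None]  # assumes age spans 40 years in the data set
p = (30, "red", True)
q = (50, "blue", True)
d = gower_distance(p, q, kinds, ranges)  # (20/40 + 1 + 0) / 3
```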

4
Q

Asymmetric binary distance ignores

A

0-0 matches
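A small sketch contrasting the asymmetric binary distance with simple matching, which does count 0-0 pairs. The function names and the vectors `u` and `v` are illustrative only:

```python
def asymmetric_binary_distance(u, v):
    """Mismatches divided by the pairs where at least one value is 1."""
    both = sum(1 for x, y in zip(u, v) if x == 1 and y == 1)
    differ = sum(1 for x, y in zip(u, v) if x != y)
    considered = both + differ  # 0-0 pairs are left out entirely
    return differ / considered if considered else 0.0

def simple_matching_distance(u, v):
    """Mismatches divided by all pairs, so 0-0 pairs count as agreement."""
    differ = sum(1 for x, y in zip(u, v) if x != y)
    return differ / len(u)

u = [1, 0, 0, 0, 1, 0]
v = [1, 1, 0, 0, 0, 0]
d_asym = asymmetric_binary_distance(u, v)  # 2 mismatches / 3 considered pairs
d_sym = simple_matching_distance(u, v)     # 2 mismatches / 6 pairs
```

The three shared zeros pull the symmetric distance down but leave the asymmetric one unchanged.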

5
Q

The bootstrap, and its most common use, can be best summarized as

A

Repeated sampling from the data with replacement, fitting a model to those samples and investigating the changes in parameter estimates
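A minimal sketch of the bootstrap, assuming the "model" is just a sample mean so the example stays short. The data values, the seed, and the helper name `bootstrap_se` are made up for illustration:

```python
import random
import statistics

def bootstrap_se(data, estimator, n_resamples=2000, seed=0):
    """Estimate the standard error of `estimator` by resampling with replacement."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations from the data WITH replacement.
        resample = rng.choices(data, k=n)
        estimates.append(estimator(resample))
    # The spread of the re-estimated parameter approximates its standard error.
    return statistics.stdev(estimates), estimates

data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]  # hypothetical sample
se, estimates = bootstrap_se(data, statistics.mean)
```

Replacing `statistics.mean` with a function that fits a model and returns a coefficient gives the usual use: seeing how much the parameter estimate moves across resamples.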

6
Q

The jackknife method could be summarized as

A

Application of CV to estimation of standard errors and bias
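A sketch of the jackknife using the standard leave-one-out formulas for bias and standard error. The data values and the helper name `jackknife` are illustrative:

```python
import math
import statistics

def jackknife(data, estimator):
    """Leave-one-out estimates of the bias and standard error of `estimator`."""
    n = len(data)
    theta_hat = estimator(data)
    # Re-estimate n times, each time leaving out exactly one observation.
    loo = [estimator(data[:i] + data[i + 1:]) for i in range(n)]
    theta_bar = sum(loo) / n
    bias = (n - 1) * (theta_bar - theta_hat)
    se = math.sqrt((n - 1) / n * sum((t - theta_bar) ** 2 for t in loo))
    return bias, se

data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9]  # hypothetical sample
bias, se = jackknife(data, statistics.mean)
```

For the sample mean, the jackknife bias is zero and the standard error matches the usual s / sqrt(n), which makes a handy sanity check.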

7
Q

R function to fit logistic regression

A

glm(), with family = binomial

8
Q

Which of the following statements generally holds true about testing and training sets?

The logloss of the test set equals the logloss of the training set

The logloss of the test set is less than the logloss of the training set

The logloss of the test set is larger than the logloss of the training set

The logloss of the test set does not generally have any relationship with the logloss of the training set

A

the logloss of the test set is larger than the logloss of the training set
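For reference, a sketch of how logloss is computed for binary labels. A model is fitted to minimise this quantity on the training set, which is why the value on unseen test data tends to come out larger. The labels and predicted probabilities below are made up:

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)

# Hypothetical predictions: confident and correct vs. hesitant.
confident = log_loss([1, 0, 1], [0.9, 0.1, 0.8])
hesitant = log_loss([1, 0, 1], [0.6, 0.4, 0.5])
```

Lower is better: predictions that are confident and correct give a smaller logloss than hesitant ones.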

9
Q

A simple model is most at risk of suffering from high…

A

bias

10
Q

A flexible model is most at risk of suffering from high…

A

variance

11
Q

In multiple linear regression, the R^2 value provides the…

A

proportion of the total variation in the response variable explained by the model
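A quick sketch of the calculation, R^2 = 1 - SS_res / SS_tot. The observed and fitted values are hypothetical:

```python
def r_squared(y, y_hat):
    """Share of the total variation in y that the fitted values account for."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)              # total variation
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    return 1.0 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]  # hypothetical fitted values
r2 = r_squared(y, y_hat)      # SS_tot = 5, SS_res = 1
```

Here the model explains 80% of the variation in the response.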

12
Q

Q. 12, peep the pic

A

g

13
Q

d

A

d

14
Q

What is a p-value?

A

the probability of observing a test statistic as extreme as, or more extreme than, the one we observed, assuming the null hypothesis is true
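A worked sketch with an assumed example: 8 heads in 10 flips, null hypothesis of a fair coin, one-sided test. The function name is illustrative:

```python
from math import comb

def binomial_upper_p_value(k, n, p0=0.5):
    """P(X >= k) for X ~ Binomial(n, p0): results at least as extreme as k."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

# Observed 8 heads in 10 flips; null hypothesis: the coin is fair.
p_value = binomial_upper_p_value(8, 10)  # (45 + 10 + 1) / 1024
```

The sum runs over 8, 9 and 10 heads, that is, the observed result plus everything more extreme, all computed under the null.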

15
Q

look at pic

A

d

16
Q

Which of the following is True about hierarchical clustering?

Resulting cluster memberships depend on random starting points (non-deterministic)

Resulting cluster memberships do not depend on the scale of the data (scale invariant)

Resulting cluster memberships do not depend on the distance-measure matrix

Resulting cluster memberships do not depend on a chosen linkage method

A

Resulting cluster memberships do not depend on a chosen linkage method

17
Q

Which of the following statements is FALSE about k-means clustering?

Resulting cluster memberships depend on random starting points (non-deterministic)

Resulting cluster memberships depend on the scale of the data (not scale invariant)

Resulting cluster memberships provide the global maxima for the within group sum of squares

Resulting cluster memberships do not depend on a chosen linkage method

A

Resulting cluster memberships provide the global maxima for the within group sum of squares
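A small 1-D sketch of why the starting points matter in k-means: the algorithm only converges to a local minimum of the within-group sum of squares, so different starts can end in different clusterings. The data and starting centers are made up, and `kmeans_1d` is a bare-bones Lloyd's algorithm, not a library routine:

```python
def kmeans_1d(points, centers, max_iter=100):
    """Plain Lloyd's algorithm on 1-D data from the given starting centers."""
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(len(centers)), key=lambda j: abs(x - centers[j]))
                  for x in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(len(centers)):
            members = [x for x, lab in zip(points, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break  # converged: the centers stopped moving
        centers = new_centers
    wss = sum((x - centers[lab]) ** 2 for x, lab in zip(points, labels))
    return labels, centers, wss

points = [0, 1, 2, 10, 11, 12, 20, 21, 22]
_, _, wss_good = kmeans_1d(points, [1, 11, 21])  # well-spread start
_, _, wss_bad = kmeans_1d(points, [0, 1, 2])     # all starts in one group
```

Both runs converge, but the second settles on a much larger within-group sum of squares, which is why k-means is usually rerun from several random starts.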