Midterm Practice Problems Flashcards

Question 1

Q

Describe what the “Bayes classifier” is.

Answer

A

The Bayes classifier assigns each observation to the class with the highest posterior probability given its predictors, and so reaches the true minimum misclassification rate. It can be thought of as the underlying model that generates the true categories of the observations.

Question 2

Q

Will the Bayes classifier result in 0 misclassifications?

Answer

A

While there may be some cases where the data are extremely well separated and the Bayes classifier therefore results in 0 misclassifications, in general this is not expected.
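As a minimal sketch of why the error is usually above zero, assume two equally likely classes in one dimension, each normal with the same standard deviation. The function names and the numbers below are illustrative assumptions, not part of the original cards. In this setup the Bayes classifier picks the class with the nearer mean, and its error rate is the standard normal CDF evaluated at minus half the standardized distance between the means.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def bayes_error_two_gaussians(mu0: float, mu1: float, sigma: float) -> float:
    """Bayes error for two equally likely classes N(mu0, sigma^2) and N(mu1, sigma^2).

    Assumption: equal priors and equal variances, so the Bayes boundary is the
    midpoint of the two means and each class is misclassified with probability
    Phi(-|mu1 - mu0| / (2 * sigma)).
    """
    d = abs(mu1 - mu0) / sigma
    return normal_cdf(-d / 2.0)

# Overlapping classes: even the best possible rule makes mistakes.
overlapping = bayes_error_two_gaussians(0.0, 1.0, 1.0)
# Extremely well-separated classes: the error is practically zero.
separated = bayes_error_two_gaussians(0.0, 20.0, 1.0)

print(f"overlapping classes: {overlapping:.4f}")
print(f"well-separated classes: {separated:.2e}")
```

With means one standard deviation apart the Bayes error is roughly 31%, while with means twenty standard deviations apart it is negligible.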

Question 3

Q

Under what assumptions is Linear Discriminant Analysis the Bayes classifier?

Answer

A

If each group (or subpopulation) is assumed to be multivariate normally distributed, and all groups have a common covariance matrix.
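To see where the linearity comes from, here is the usual discriminant function under those assumptions. The notation is an assumption added for illustration: \pi_k is the prior probability of group k, \mu_k its mean vector, and \Sigma the shared covariance matrix. An observation x is assigned to the group with the largest \delta_k(x).

```latex
\delta_k(x) = x^{\top} \Sigma^{-1} \mu_k
            - \tfrac{1}{2}\, \mu_k^{\top} \Sigma^{-1} \mu_k
            + \log \pi_k
```

Because \Sigma is common to all groups, the term that is quadratic in x cancels when two groups are compared, so the boundary between any two groups is linear in x.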

Question 4

Q

Under what assumptions is Quadratic Discriminant Analysis the Bayes classifier?

Answer

A

If each group is assumed to be multivariate normally distributed, with each group having its own covariance matrix.
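Under those assumptions the discriminant function can be written as follows. The notation is an assumption added for illustration: \pi_k is the prior probability of group k, \mu_k its mean vector, and \Sigma_k its own covariance matrix. An observation x is assigned to the group with the largest \delta_k(x).

```latex
\delta_k(x) = -\tfrac{1}{2} \log \lvert \Sigma_k \rvert
            - \tfrac{1}{2}\, (x - \mu_k)^{\top} \Sigma_k^{-1} (x - \mu_k)
            + \log \pi_k
```

Since \Sigma_k differs between groups, the term that is quadratic in x does not cancel when two groups are compared, which is why the resulting boundaries are quadratic.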

Question 5

Q

What is clustering?

Answer

A

Clustering attempts to separate observations into groups according to the predictors (X). There is no known response (Y) being modelled, so it is an exploratory procedure.

Question 6

Q

What is classification?

Answer

A

Classification is the process of fitting a model using predictors (X) to predict a categorical response variable (Y).

Question 7

Q

What's the difference between clustering and classification?

Question 8

Q

Example of Clustering

Question 9

Q

Example of Classification

Question 10

Q

What is a p-value?

Answer

A

The p-value for a hypothesis test is the probability of observing a test statistic as extreme as, or more extreme than (in the direction of the alternative hypothesis), the one we observed, assuming the null hypothesis is true.
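A small worked example may help. The sketch below assumes the test statistic is standard normal under the null hypothesis (a z-test), which is an assumption made for illustration and not something the card specifies. The function names are hypothetical.

```python
import math

def upper_tail_p_value(z_observed: float) -> float:
    """P(Z >= z_observed) for Z ~ N(0, 1): the p-value when the alternative
    hypothesis points towards larger values of the statistic."""
    return 0.5 * math.erfc(z_observed / math.sqrt(2.0))

def two_sided_p_value(z_observed: float) -> float:
    """P(|Z| >= |z_observed|) for Z ~ N(0, 1): the p-value when the alternative
    hypothesis is two-sided, so both tails count as extreme."""
    return math.erfc(abs(z_observed) / math.sqrt(2.0))

print(f"upper-tailed, z = 1.645: {upper_tail_p_value(1.645):.4f}")
print(f"two-sided,    z = 1.96:  {two_sided_p_value(1.96):.4f}")
```

Both calls give a p-value close to 0.05. Which tail (or tails) counts as "more extreme" depends on the alternative hypothesis.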

Question 11

Q

Suggest a way of finding the ‘best’ number of groups (k) for a data set using k-means.

Answer

A

Run k-means for every reasonable number of groups we might wish to consider, and record the total within-group sum of squares for each. View those values graphically, and determine the k beyond which increasing the number of groups shows little further improvement.
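The procedure can be sketched in plain Python. Everything here is an illustrative assumption: the toy data (three tight, well-separated blobs), the helper names, and the deterministic farthest-first choice of starting centers, which is used only to keep the example reproducible. In practice k-means is usually run from several random starts.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def farthest_first_centers(points, k):
    """Deterministic starting centers (an assumption for reproducibility):
    start at the first point, then repeatedly add the point farthest from
    the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers

def kmeans_wss(points, k, max_iter=100):
    """Run basic (Lloyd-style) k-means and return the total within-group sum of squares."""
    centers = farthest_first_centers(points, k)
    for _ in range(max_iter):
        # Assignment step: each point joins the group of its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            groups[nearest].append(p)
        # Update step: each center moves to the mean of its group.
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(squared_distance(p, c) for c in centers) for p in points)

# Toy data: three small blobs around (0, 0), (10, 0) and (5, 10).
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
points = [
    (cx + dx, cy + dy)
    for cx, cy in [(0.0, 0.0), (10.0, 0.0), (5.0, 10.0)]
    for dx, dy in offsets
]

wss_by_k = {k: kmeans_wss(points, k) for k in range(1, 6)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 3))
```

On this toy data the within-group sum of squares falls sharply up to k = 3 and only slightly afterwards, so the "elbow" of the plot would sit at three groups.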

Question 12

Q

What is Linear Discriminant Analysis?

Answer

A

A method used to find a linear combination of features that characterizes or separates two or more classes of objects or events.

Question 13

Q

What is Quadratic Discriminant Analysis?

Answer

A

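A method used to determine which variables discriminate between two or more naturally occurring groups. It may have a descriptive or a predictive objective. Unlike LDA, it allows each group its own covariance matrix, which gives quadratic rather than linear boundaries between groups.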

Question 14

Q

What are LDA and QDA used for?

Answer

A

Both are statistical learning methods used to assign observations to a class or category.

Question 15

Q

What type of response variable do LDA and QDA use?

Answer

A

Categorical.

Question 16

Q

Suppose we knew that the Bayes classifier was a boundary that was extremely non-linear. If we were using K-nearest neighbours as a classifier, would you expect a larger or smaller value of K to provide a better approximation of the boundary?

Answer

A

As K increases for K-nearest neighbours, we will see increasingly simple (linear-looking) boundaries. Therefore, if we have a complicated (extremely non-linear) boundary, we would expect a relatively small value of K to perform better.
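A toy illustration of this effect, using only the Python standard library. The data, labels, and function name are made up for the example: the class alternates several times along a line, which stands in for a highly non-linear boundary.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k):
    """Predict the label of `query` by majority vote among its k nearest
    training points (squared Euclidean distance)."""
    by_distance = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = Counter(train_y[i] for i in by_distance[:k])
    return votes.most_common(1)[0][0]

# One predictor; the true class switches between "a" and "b" along the line.
train_x = [(float(x),) for x in range(13)]
train_y = ["a", "a", "a", "b", "b", "a", "a", "a", "b", "b", "a", "a", "a"]

# New points that fall inside each stretch, with the class of that stretch.
queries = [(0.5,), (3.5,), (6.0,), (8.5,), (11.0,)]
truth = ["a", "b", "a", "b", "a"]

accuracy = {}
for k in (1, len(train_x)):
    predictions = [knn_predict(train_x, train_y, q, k) for q in queries]
    accuracy[k] = sum(p == t for p, t in zip(predictions, truth)) / len(truth)
    print(f"K = {k}: predictions = {predictions}, accuracy = {accuracy[k]}")
```

With K = 1 the classifier follows every switch in class, while with K equal to the number of training points every query receives the overall majority class "a", so the "b" stretches are missed entirely.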

(16 cards)