Lecture 7 Flashcards

Question 1

Q

Random Variable

Answer

A

A variable whose value is unknown in advance because it depends on the outcome of an uncertain event

Question 2

Q

Domain

Answer

A

The set of values a random variable can take

Question 3

Q

Conditional probability

Answer

A

The probability that some outcome occurs given that another event has occurred
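
A minimal sketch of estimating a conditional probability from counts, using p(A|B) = p(A and B) / p(B). The weather data and the function name are hypothetical, chosen only for illustration:

```python
from fractions import Fraction

# Hypothetical observations: (sky, rained) for 10 days.
days = [
    ("cloudy", True), ("cloudy", True), ("cloudy", True), ("cloudy", False),
    ("clear", False), ("clear", False), ("clear", False), ("clear", False),
    ("clear", True), ("clear", False),
]

def conditional_probability(data, event, given):
    """Estimate p(event | given) as count(event and given) / count(given)."""
    given_rows = [row for row in data if given(row)]
    both = [row for row in given_rows if event(row)]
    return Fraction(len(both), len(given_rows))

# p(rain | cloudy): restrict to cloudy days, then count the rainy ones.
p_rain_given_cloudy = conditional_probability(
    days, event=lambda r: r[1], given=lambda r: r[0] == "cloudy"
)

# Unconditional p(rain), for comparison.
p_rain = Fraction(sum(r[1] for r in days), len(days))
```

With this data, knowing that the sky is cloudy raises the estimated chance of rain from 2/5 to 3/4.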

Question 4

Q

Joint Probability

Answer

A

The probability that a set of random variables simultaneously take specific values
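
As a rough illustration, a joint distribution over two random variables can be estimated by counting how often each combination of values occurs together. The observations below are hypothetical:

```python
from collections import Counter
from fractions import Fraction

# Hypothetical paired observations of two random variables: (sky, weather).
observations = [
    ("cloudy", "rain"), ("cloudy", "rain"), ("cloudy", "dry"),
    ("clear", "dry"), ("clear", "dry"), ("clear", "dry"),
    ("clear", "rain"), ("clear", "dry"),
]

counts = Counter(observations)
total = len(observations)

# Joint distribution: p(sky = s, weather = w) for every observed combination.
joint = {pair: Fraction(n, total) for pair, n in counts.items()}

p_cloudy_and_rain = joint[("cloudy", "rain")]
```

The joint probabilities over all combinations sum to 1.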

Question 5

Q

What are three types of classifiers?

Answer

A

Instance-based, generative, and discriminative classifiers

Question 6

Q

Instance based classifiers

Answer

A

Use observations directly without models

e.g. k nearest neighbors
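
A minimal sketch of k nearest neighbors, assuming Euclidean distance and a simple majority vote. The function name and the data are hypothetical. Note that there is no training step: the stored observations are used directly at prediction time.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the majority label among the k training points closest to query."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical 2-D training data with two classes.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, (1.1, 1.0))
```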

Question 7

Q

Generative classifiers

Answer

A

build a generative statistical model

e.g. Bayes classifiers

Question 8

Q

Discriminative classifiers

Answer

A

directly estimate a decision rule/boundary

e.g. decision trees

Question 9

Q

Gaussian Naive Bayes classifier

Answer

A

assumes that features follow a normal distribution
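
A minimal sketch of the idea, not a reference implementation: each class gets a mean and variance per feature, and prediction picks the class with the highest log prior plus summed Gaussian log-likelihoods. The function names, the small variance floor, and the data are assumptions made for illustration.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Return, per class, the prior, feature means, and feature variances."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant added to each variance to avoid division by zero.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Pick the class maximizing log prior + sum of Gaussian log-likelihoods."""
    def log_score(prior, means, variances):
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_score(*model[label]))

# Hypothetical data: (height_cm, weight_kg) for two classes.
X = [(150.0, 50.0), (155.0, 54.0), (160.0, 58.0),
     (180.0, 80.0), (185.0, 86.0), (190.0, 90.0)]
y = ["small", "small", "small", "large", "large", "large"]
model = fit_gaussian_nb(X, y)
```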

Question 10

Q

Multinomial Naive Bayes

Answer

A

Assumes each feature is an integer count of something, such as how often a word appears in a sentence
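
A short sketch of what such count features look like, assuming whitespace tokenization and a fixed vocabulary (both hypothetical choices):

```python
from collections import Counter

def count_vector(sentence, vocabulary):
    """Map a sentence to integer word counts, one entry per vocabulary word."""
    counts = Counter(sentence.lower().split())
    return [counts[word] for word in vocabulary]

# Hypothetical vocabulary and sentence.
vocabulary = ["free", "money", "meeting"]
features = count_vector("Free money free prizes", vocabulary)
```

Here `features` is `[2, 1, 0]`: "free" appears twice, "money" once, and "meeting" not at all.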

Question 11

Q

Bernoulli Naive Bayes

Answer

A

Assumes the features are binary, or are continuous values that can be binarized with a predefined threshold
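
A minimal sketch of binarizing continuous features with a fixed threshold. The threshold of 0.5 and the sample values are assumptions for illustration:

```python
def binarize(values, threshold=0.5):
    """Return 1 for each value strictly above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]

# Hypothetical continuous features, e.g. normalized pixel intensities.
pixels = [0.1, 0.7, 0.5, 0.93]
binary_features = binarize(pixels, threshold=0.5)
```

Here `binary_features` is `[0, 1, 0, 1]`; a value exactly at the threshold maps to 0 under this convention.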

Question 12

Q

Advantages of Naive Bayes classifiers

Answer

A

They are simple and work well with a small amount of training data. Prediction is straightforward: the class with the highest probability is taken as the most likely class

Question 13

Q

Disadvantage of Naive Bayes classifiers

Answer

A

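Estimates its parameters under the "naive" assumption that features are independent given the class, which often does not hold in practice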

Question 14

Q

What is the complexity of a decision tree model determined by?

Answer

A

the depth of the tree

Question 15

Q

What causes overfitting in decision trees?

Answer

A

Increasing the depth of the tree and thus increasing the number of decision boundaries
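
A rough back-of-the-envelope sketch, assuming binary splits: a tree of depth d can carve the input space into at most 2^d leaf regions, so a deep enough tree has room to isolate every training sample. The sample size below is hypothetical.

```python
def max_leaves(depth):
    """Upper bound on the number of leaf regions in a binary tree of this depth."""
    return 2 ** depth

n_samples = 100  # hypothetical training-set size

for depth in (1, 3, 5, 7, 10):
    leaves = max_leaves(depth)
    # Once leaves >= n_samples, the tree could give each sample its own region.
    print(depth, leaves, leaves >= n_samples)
```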

Question 16

Q

What is the aim of Bayesian Linear Regression?

Answer

A

Not to find the single “best” value of the model parameters, but rather to determine the posterior distribution for the model parameters
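
A minimal one-parameter sketch of this idea, assuming the model y = w * x + noise with a Gaussian prior on w and Gaussian noise of known variance (the conjugate case, where the posterior is also Gaussian). The function name, default prior, and data are hypothetical.

```python
def posterior_slope(xs, ys, prior_mean=0.0, prior_var=1.0, noise_var=1.0):
    """Posterior mean and variance of w in y = w * x + noise.

    Assumes a Gaussian prior N(prior_mean, prior_var) on w and Gaussian
    noise with known variance noise_var.
    """
    # Precisions (inverse variances) from the prior and the data add up.
    precision = 1.0 / prior_var + sum(x * x for x in xs) / noise_var
    mean = (prior_mean / prior_var
            + sum(x * y for x, y in zip(xs, ys)) / noise_var) / precision
    return mean, 1.0 / precision

# Hypothetical data roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

mean_few, var_few = posterior_slope(xs[:2], ys[:2])  # first two points only
mean_all, var_all = posterior_slope(xs, ys)          # all four points
```

The result is a distribution over w rather than a single number: as more data is observed, the posterior variance shrinks and the posterior mean moves away from the prior toward the value supported by the data.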

Question 17

Q

Advantages of Decision Trees

Answer

A

They are suitable for multiclass classification, are easily interpretable, can handle both numerical and categorical data, are non-linear, and can tolerate missing values

Question 18

Q

Disadvantages of Decision Trees

Answer

A

They are prone to overfitting without pruning and are weak learners: a single tree on its own does not make great predictions

Question 19

Q

Which classification algorithms that we know are linear?

Answer

A

Logistic regression and linear SVMs

Question 20

Q

Which classification algorithms that we know are nonlinear?

Answer

A

KNN
Neural Networks (Multi-Layer Perceptron)
Kernel SVMs
Naive Bayes
and Decision Trees

Question 21

Q

What does the vertical line | in probability theory refer to?

Answer

A

"Given"

e.g. p(A|B) is the probability of A given B.

(21 cards)