Module 5 Flashcards

1
Q

Decision trees can be used for classification. True or false?

A

True

2
Q

Why do we prefer information gain over accuracy when splitting?

Answers:
a.
Decision trees are prone to overfitting, and accuracy does not help them generalize

b.
Information gain is more stable than accuracy

c.
Information gain chooses more impactful features closer to root

d.
All of these

A

d.
All of these
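
One way to see why a split criterion other than accuracy helps is to compare two candidate splits that classify equally well. The sketch below is a minimal illustration and the class counts are made up for the example, not taken from this deck. Both splits give the same accuracy, but information gain prefers the one that produces a pure child node.

```python
from math import log2

def entropy(counts):
    """Shannon entropy (in bits) of a node given its per-class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the children."""
    total = sum(parent)
    weighted = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

def split_accuracy(children):
    """Accuracy if every child predicts its majority class."""
    total = sum(sum(child) for child in children)
    return sum(max(child) for child in children) / total

# Hypothetical counts: (class 0, class 1) in each node.
parent = (400, 400)
split_a = [(300, 100), (100, 300)]  # two impure children
split_b = [(200, 400), (200, 0)]    # one impure child, one pure child

acc_a, acc_b = split_accuracy(split_a), split_accuracy(split_b)
gain_a = information_gain(parent, split_a)
gain_b = information_gain(parent, split_b)

print(f"accuracy: A={acc_a:.2f}, B={acc_b:.2f}")
print(f"information gain: A={gain_a:.3f}, B={gain_b:.3f}")
```

Accuracy cannot tell the two splits apart, while information gain ranks split B higher because it isolates a pure region.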

3
Q

Which of the following are true about an individual tree in a random forest classifier?
1. It is built on a subset of the features
2. It is built on all the features
3. It is built on a subset of the observations
4. It is built on the full set of observations

Answers:
a.
2 and 4

b.
1 and 4

c.
1 and 3

d.
2 and 3

A

c.
1 and 3
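
The two sources of randomness can be sketched with the standard library alone. This is a simplified illustration, not a real implementation: the helper name `draw_tree_inputs` and its parameters are hypothetical, and many random forest implementations redraw the feature subset at every split rather than once per tree.

```python
import random

def draw_tree_inputs(n_rows, n_features, max_features, rng):
    """Pick the rows and features one tree would see (hypothetical helper)."""
    # Bootstrap sample: draw n_rows row indices with replacement,
    # so some rows repeat and others are left out.
    rows = [rng.randrange(n_rows) for _ in range(n_rows)]
    # Feature subset: draw max_features distinct feature indices.
    features = sorted(rng.sample(range(n_features), max_features))
    return rows, features

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows, features = draw_tree_inputs(n_rows=10, n_features=6, max_features=2, rng=rng)
unique_rows = set(rows)

print("rows drawn:", rows)
print("distinct rows seen by this tree:", len(unique_rows))
print("features seen by this tree:", features)
```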

4
Q

Which of the following are disadvantages of decision tree classifiers?

Answers:
a.
not easy to interpret

b.
not a very stable algorithm

c.
overfits easily by memorizing the training data

d.
Both b and c

A

d.
Both b and c

5
Q

Imagine a two-variable predictor space containing 10 data points. A decision tree with 5 leaf nodes is built over it. How many distinct regions are formed in the predictor space?

a.
5

b.
2

c.
25

d.
10

A

a. 5
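
Each leaf corresponds to exactly one region of the predictor space, regardless of how many data points there are. The sketch below hard-codes a hypothetical 5-leaf tree over two predictors (the thresholds and the 10 points are invented for illustration) and counts the regions the points fall into.

```python
def leaf_for(x1, x2):
    """Route a point through a hand-written tree with 4 splits and 5 leaves."""
    if x1 < 5:
        if x2 < 5:
            return "R1"
        return "R2"
    if x2 < 3:
        return "R3"
    if x1 < 8:
        return "R4"
    return "R5"

# Ten illustrative data points in the (x1, x2) plane.
points = [(1, 1), (2, 7), (3, 3), (4, 9), (6, 1),
          (7, 2), (6, 6), (7, 8), (9, 4), (9, 9)]

regions = {leaf_for(x1, x2) for x1, x2 in points}
print(sorted(regions))
print("distinct regions:", len(regions))
```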

6
Q

Random forests can be used to classify infinite-dimensional data. True or false?

A

True

7
Q

Which of the following is correct with respect to random forest?

Answers:
a.
Random forests are difficult to interpret but often very accurate

b.
Random forests are easy to interpret and often very accurate

c.
Random forests are difficult to interpret and much less accurate

d.
None of the above

A

a.
Random forests are difficult to interpret but often very accurate

8
Q

Which of the following is selected by the classifier itself when using a random forest classifier? (Assume the user keeps the default parameters.)

Answers:
a.
Number of decision trees

b.
Features to be taken into account when building a tree

c.
Samples used to train each individual tree in the forest

d.
b and c

e.
a, b, and c

A

d.
b and c

9
Q

You are given two independent variables X and Y with P(x) = 0.3 and P(y and x) = 0.2. What is the probability P(y|x)?

Answers:
a.
0.5

b.
2/3

c.
1/6

d.
1/3

A

b.
2/3
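
The answer follows from the definition of conditional probability, P(y|x) = P(y and x) / P(x). A short check with exact fractions, using only the numbers given in the question:

```python
from fractions import Fraction

p_x = Fraction(3, 10)        # P(x) = 0.3
p_y_and_x = Fraction(2, 10)  # P(y and x) = 0.2

# Definition of conditional probability.
p_y_given_x = p_y_and_x / p_x
print(p_y_given_x)

# Under independence P(y) equals P(y|x), and the product rule
# P(x) * P(y) reproduces the given joint probability.
p_y = p_y_given_x
print(p_x * p_y == p_y_and_x)
```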

10
Q

Which of the following statements about Naive Bayes is incorrect?

Answers:
a.
Features are statistically dependent on one another given the class value.

b.
Features can be nominal or numeric.

c.
Features are statistically independent of one another given the class value.

d.
Features are equally important.

A

a.
Features are statistically dependent on one another given the class value.
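
The conditional independence assumption is what lets Naive Bayes multiply per-feature likelihoods. The sketch below is a minimal illustration with invented labels, feature names, and probabilities (none come from this deck): each class score is the prior times the product of the likelihoods of the observed features, then the scores are normalized.

```python
from math import prod

# Hypothetical class priors and per-feature likelihoods P(feature | class).
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"has_link": 0.8, "all_caps": 0.5},
    "ham": {"has_link": 0.2, "all_caps": 0.1},
}

def posterior(observed_features):
    """Posterior over classes, assuming features are independent given the class."""
    scores = {
        label: priors[label] * prod(likelihoods[label][f] for f in observed_features)
        for label in priors
    }
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

result = posterior(["has_link", "all_caps"])
print(result)
```

If the features were dependent given the class, the simple product would no longer equal the joint likelihood, which is why statement a is the incorrect one.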
